[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Zakhar Kirpichenko
Thank you, Igor. I was just reading the detailed list of changes for
16.2.14, as I suspected that we might not be able to go back to the
previous minor release :-) Thanks again for the suggestions; we'll consider
our options.

/Z

On Fri, 20 Oct 2023 at 16:08, Igor Fedotov  wrote:

> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Igor Fedotov

Zakhar,

my general concern about downgrading to previous versions is that this
procedure is neither assumed nor tested by the dev team, although it is
possible most of the time. In this specific case, however, it is not doable
due to (at least) https://github.com/ceph/ceph/pull/52212, which enables
4K BlueFS allocation unit support - once a daemon starts using it, there is
no way back.


I still think that setting "fit_to_fast" mode without enabling dynamic
compaction levels is quite safe, but it is definitely better to test it in
the real environment, under the actual workload, first. You might also want
to apply such a workaround gradually, as sketched below: one daemon first,
bake it for a while, then apply it to the full node, bake a bit more, and
finally update the remaining OSDs. Or, even better, bake it in a test
cluster first.


Alternatively, you might consider building the updated code yourself and
making patched binaries on top of .14...



Thanks,

Igor


On 20/10/2023 15:10, Zakhar Kirpichenko wrote:

[...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Tyler Stachecki
On Fri, Oct 20, 2023, 8:51 AM Zakhar Kirpichenko  wrote:

>
> We would consider upgrading, but unfortunately our Openstack Wallaby is
> holding us back as its cinder doesn't support Ceph 17.x, so we're stuck
> with having to find a solution for Ceph 16.x.
>

Wallaby is also quite old at this time... are you aware that the W release
of Cinder has foregone backporting the fix for the critical, high-profile
CVE-2023-2088 due to its age?
https://github.com/openstack/cinder/commit/2fef6c41fa8c5ea772cde227a119dcf22ce7a07d

There was some tension over this at the OpenInfra Summit this past year between
some of the operators and developers. Wallaby is still marked EM (Extended
Maintenance) upstream, but even so it did not get this patch.

The story is, unfortunately, the same here: the only way out of some of
these holes is to upgrade...

Regards,
Tyler


> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Zakhar Kirpichenko
Thanks, Tyler. I appreciate what you're saying, though I can't fully agree:
16.2.13 didn't have crashing OSDs, so the crashes in 16.2.14 look like a
regression - please correct me if I'm wrong. If it is indeed a regression,
then I'm not sure that suggesting an upgrade is the right thing to do in
this case.

We would consider upgrading, but unfortunately our OpenStack Wallaby
deployment is holding us back, as its Cinder doesn't support Ceph 17.x, so
we're stuck with finding a solution for Ceph 16.x.

/Z

On Fri, 20 Oct 2023 at 15:39, Tyler Stachecki 
wrote:

> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Tyler Stachecki
On Fri, Oct 20, 2023, 8:11 AM Zakhar Kirpichenko  wrote:

> Thank you, Igor.
>
> It is somewhat disappointing that fixing this bug in Pacific has such a low
> priority, considering its impact on existing clusters.
>

Unfortunately, the hard truth here is that Pacific (stable) was released
over 30 months ago. It has had a good run for a freely distributed product,
and there's only so much time you can dedicate to backporting bugfixes --
it claws time away from other forward-thinking initiatives.

Speaking as someone who's been at the helm of production clusters, I know
Ceph upgrades can be an experience and it's frustrating to hear, but you
have to jump sometime...

Regards,
Tyler


> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Zakhar Kirpichenko
Thank you, Igor.

It is somewhat disappointing that fixing this bug in Pacific has such a low
priority, considering its impact on existing clusters.

The document attached to the PR explicitly says about
`level_compaction_dynamic_level_bytes` that "enabling it on an existing DB
requires special caution"; we'd rather not experiment with something that
has the potential to cause data corruption or loss in a production cluster.
Perhaps a downgrade to the previous version, 16.2.13, which worked for us
without any issues, is an option - or would you advise against such a
downgrade from 16.2.14?

/Z

On Fri, 20 Oct 2023 at 14:46, Igor Fedotov  wrote:

> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-20 Thread Igor Fedotov

Hi Zakhar,

We definitely expect one more (and apparently the last) Pacific minor
release. There is no specific date yet, though - the plan is to release
Quincy and Reef minor releases prior to it, hopefully before
Christmas/New Year.


Meanwhile, you might want to work around the issue by tuning
bluestore_volume_selection_policy. Unfortunately, my original proposal to
set it to rocksdb_original most likely wouldn't work in this case, so you
had better try "fit_to_fast" mode. This should be coupled with enabling
'level_compaction_dynamic_level_bytes' mode in RocksDB - there is a pretty
good spec on applying this mode to BlueStore attached to
https://github.com/ceph/ceph/pull/37156; a rough sketch of the knobs
involved follows.
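To make the moving parts concrete, here is that sketch - it is not the full
procedure; the spec attached to the PR describes the necessary precautions
and should be followed as written. It assumes the RocksDB option is injected
by appending it to the OSD's existing bluestore_rocksdb_options string;
osd.6 is a hypothetical example:

   # Inspect the current RocksDB options first, so the defaults are
   # preserved when appending (do not replace the string wholesale):
   ceph config get osd.6 bluestore_rocksdb_options

   # Append the dynamic-levels flag to the options shown above, then
   # switch the volume selector; both require an OSD restart:
   ceph config set osd.6 bluestore_rocksdb_options \
       "<existing options>,level_compaction_dynamic_level_bytes=true"
   ceph config set osd.6 bluestore_volume_selection_policy fit_to_fast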



Thanks,

Igor

On 20/10/2023 06:03, Zakhar Kirpichenko wrote:
[...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-19 Thread Zakhar Kirpichenko
Igor, I noticed that there's no roadmap for the next 16.2.x release. May I
ask what time frame we are looking at with regard to a possible fix?

We're experiencing several OSD crashes per day caused by this issue.

/Z

On Mon, 16 Oct 2023 at 14:19, Igor Fedotov  wrote:

> [...]


[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-16 Thread Igor Fedotov

That's true.

On 16/10/2023 14:13, Zakhar Kirpichenko wrote:
Many thanks, Igor. I found previously submitted bug reports and 
subscribed to them. My understanding is that the issue is going to be 
fixed in the next Pacific minor release.


/Z

[...]




[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-16 Thread Zakhar Kirpichenko
Many thanks, Igor. I found previously submitted bug reports and subscribed
to them. My understanding is that the issue is going to be fixed in the
next Pacific minor release.

/Z

On Mon, 16 Oct 2023 at 14:03, Igor Fedotov  wrote:

> [...]


[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-16 Thread Igor Fedotov

Hi Zakhar,

please see my reply to the post on a similar issue at:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/



Thanks,

Igor

On 16/10/2023 09:26, Zakhar Kirpichenko wrote:

Hi,

After upgrading to Ceph 16.2.14 we had several OSD crashes
in bstore_kv_sync thread:


1. "assert_thread_name": "bstore_kv_sync",
2. "backtrace": [
3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
4. "gsignal()",
5. "abort()",
6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x1a9) [0x564dc5f87d0b]",
7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t
const&)+0x15e) [0x564dc6604a9e]",
9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned
long)+0x77d) [0x564dc66951cd]",
10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90)
[0x564dc6695670]",
11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions
const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402)
[0x564dc6c761c2]",
15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup
const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned
long)+0x309) [0x564dc6b780c9]",
17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&,
rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned
long, bool, unsigned long*, unsigned long,
rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&,
rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&,
std::shared_ptr)+0x84) [0x564dc6b1f644]",
20. 
"(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a)
[0x564dc6b2004a]",
21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
24. "clone()"
25. ],


I am attaching two instances of crash info for further reference:
https://pastebin.com/E6myaHNU

OSD configuration is rather simple and close to default:

osd.6   dev       bluestore_cache_size_hdd    4294967296
osd.6   dev       bluestore_cache_size_ssd    4294967296
osd     advanced  debug_rocksdb               1/5
osd     advanced  osd_max_backfills           2
osd     basic     osd_memory_target           17179869184
osd     advanced  osd_recovery_max_active     2
osd     advanced  osd_scrub_sleep             0.10
osd     advanced  rbd_balance_parent_reads    false

debug_rocksdb is a recent change; otherwise this configuration has been
running without issues for months. The crashes happened on two different
hosts with identical hardware, and the hosts and storage (NVMe DB/WAL, HDD
block) don't exhibit any issues. We have not experienced such crashes with
Ceph < 16.2.14.

Is this a known issue, or should I open a bug report?

Best regards,
Zakhar


[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-16 Thread Zakhar Kirpichenko
Unfortunately, the OSD log from the earlier crash is not available. I have
extracted the OSD log, including the recent events, from the latest crash:
https://www.dropbox.com/scl/fi/1ne8h85iuc5vx78qm1t93/20231016_osd6.zip?rlkey=fxyn242q7c69ec5lkv29csx13=0
I hope this helps to identify the crash reason.

The log entries that I find suspicious are these, logged right before the crash:

debug  -1726> 2023-10-15T22:31:21.575+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17024319488 unmapped: 4164763648
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
...
debug  -1723> 2023-10-15T22:31:22.579+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17024589824 unmapped: 4164493312
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
...
debug  -1718> 2023-10-15T22:31:23.579+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17027031040 unmapped: 4162052096
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
...
debug  -1714> 2023-10-15T22:31:24.579+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17026301952 unmapped: 4162781184
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
debug  -1713> 2023-10-15T22:31:25.383+ 7f961ccb8700  5
bluestore.MempoolThread(0x55c5bee8cb98) _resize_shards cache_size:
13797582406 kv_alloc: 8321499136 kv_used: 8245313424 kv_onode_alloc:
4697620480 kv_onode_used: 4690617424 meta_alloc: 469762048 meta_used:
371122625 data_alloc: 134217728 data_used: 44314624
...
debug  -1710> 2023-10-15T22:31:25.583+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17026367488 unmapped: 4162715648
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
...
debug  -1707> 2023-10-15T22:31:26.583+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17026211840 unmapped: 4162871296
heap: 21189083136 old mem: 13797582406 new mem: 13797582406
...
debug  -1704> 2023-10-15T22:31:27.583+ 7f961ccb8700  5 prioritycache
tune_memory target: 17179869184 mapped: 17024548864 unmapped: 4164534272
heap: 21189083136 old mem: 13797582406 new mem: 13797582406

There's plenty of RAM in the system, about 120 GB free and used for cache.

/Z

On Mon, 16 Oct 2023 at 09:26, Zakhar Kirpichenko  wrote:

> [...]

[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

2023-10-16 Thread Zakhar Kirpichenko
Not sure how it managed to screw up the formatting; here is the OSD
configuration in a more readable form: https://pastebin.com/mrC6UdzN
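For reference, the same information can be pulled in tabular form straight
from the monitors, e.g.:

   # Dump all non-default options and keep the OSD entries:
   ceph config dump | grep osd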

/Z

On Mon, 16 Oct 2023 at 09:26, Zakhar Kirpichenko  wrote:

> [...]