Hi Frank,
I don't have an operational workaround; the patch
https://github.com/ceph/ceph/pull/46911/commits/f43f596aac97200a70db7a70a230eb9343018159
is simple and can be applied cleanly.
Yes, restarting the OSD will clear the pool entries; you can restart it when
the bluestore_onode items are very low.
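For anyone who wants to watch this happening: the onode counters are exposed
through the OSD's admin socket via "ceph daemon osd.N dump_mempools". A minimal
sketch in Python (the JSON field layout is what I see on Octopus/Pacific
builds; please verify on your own build):

    import json
    import subprocess

    def onode_items(osd_id: int) -> int:
        """Read the bluestore_onode item count from a local OSD's admin socket."""
        out = subprocess.run(
            ["ceph", "daemon", f"osd.{osd_id}", "dump_mempools"],
            capture_output=True, text=True, check=True,
        ).stdout
        return json.loads(out)["mempool"]["by_pool"]["bluestore_onode"]["items"]

    print(onode_items(0))

A steadily shrinking item count on an otherwise busy OSD is the symptom; a
restart resets it.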
To: ceph-users@ceph.io
Cc: d...@ceph.io
Subject: Re: [ceph-users] Re: OSD crash on Onode::put
Hi Frank,
IMO all the below logic is a bit of overkill, and no one can provide 100% valid
guidance on specific numbers at the moment. Generally, I agree with Dongdong's
point that a crash is effectively an OSD restart and hence …
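For context, the "logic below" under discussion is essentially a proactive
restart once the onode count falls under some cutoff. As a sketch only, in
Python, with a deliberately made-up threshold (that no validated number exists
is exactly the point above):

    import json
    import subprocess

    ONODE_RESTART_THRESHOLD = 100_000  # hypothetical cutoff, not a recommendation

    def onode_items(osd_id: int) -> int:
        out = subprocess.run(
            ["ceph", "daemon", f"osd.{osd_id}", "dump_mempools"],
            capture_output=True, text=True, check=True,
        ).stdout
        return json.loads(out)["mempool"]["by_pool"]["bluestore_onode"]["items"]

    def maybe_restart(osd_id: int) -> None:
        # ceph-osd@N is the stock systemd unit name on package-based installs.
        if onode_items(osd_id) < ONODE_RESTART_THRESHOLD:
            subprocess.run(["systemctl", "restart", f"ceph-osd@{osd_id}"],
                           check=True)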
To: Serkan Çoban; Anthony D'Atri
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: OSD crash on Onode::put
Hi Anthony and Serkan,
I checked the drive temperatures, and there is nothing special about this slot.
The disks in this slot are from different vendors and were not populated
incrementally. It might be a very weird coincidence. I seem to have an OSD
developing this problem in another slot on a…
Could this be a temporal coincidence? E.g., each host got a different model
drive in slot 19 via an incremental expansion.
> On Jan 10, 2023, at 05:27, Frank Schilder wrote:
>
> Following up on my previous post, we have identical OSD hosts. The very
> strange observation now is that all …
Slot 19 is inside the chassis? Did you check the chassis temperature? I
sometimes see a higher failure rate for HDDs inside the chassis than for those
at the front of the chassis. In our case it was related to the temperature
difference.
On Tue, Jan 10, 2023 at 1:28 PM Frank Schilder wrote:
>
> Following up on my previous post, …
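For comparing drive temperatures across slots, recent smartctl can emit JSON; a
small sketch (device paths are examples, and the "temperature" field is
optional and vendor-dependent):

    import json
    import subprocess

    def drive_temp_c(dev: str):
        """Best-effort drive temperature via smartctl JSON output (smartmontools 7+)."""
        out = subprocess.run(
            ["smartctl", "-a", "-j", dev],
            capture_output=True, text=True,
        ).stdout
        return json.loads(out).get("temperature", {}).get("current")

    for dev in ("/dev/sda", "/dev/sdb"):
        print(dev, drive_temp_c(dev))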
Following up on my previous post, we have identical OSD hosts. The very strange
observation now is that all outlier OSDs are in exactly the same disk slot on
these hosts. We have 5 problematic OSDs, and they are all in slot 19 on 5
different hosts. This is an extremely strange and unlikely …
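To cross-check a pattern like this, the OSD-to-host/device mapping can be
pulled from "ceph osd metadata" (a standard command; the exact key set varies a
bit between releases). A sketch with placeholder OSD ids:

    import json
    import subprocess

    def osd_location(osd_id: int) -> dict:
        """Return host and device info for one OSD."""
        out = subprocess.run(
            ["ceph", "osd", "metadata", str(osd_id)],
            capture_output=True, text=True, check=True,
        ).stdout
        md = json.loads(out)
        # Mapping a device back to a physical slot still needs
        # vendor/enclosure tooling on the host itself.
        return {k: md.get(k) for k in ("hostname", "devices", "device_ids")}

    for osd in (5, 24, 43):  # placeholder ids for the outlier OSDs
        print(osd_location(osd))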
… asking here how few onode items are acceptable before performance drops
painfully.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
____________________
From: Igor Fedotov
Sent: 09 January 2023 13:34:42
To: Dongdong Tao; ceph-users@ceph.io
Cc: d...@ceph.io
Hi Dongdong,
thanks a lot for your post, it's really helpful.
Thanks,
Igor
On 1/5/2023 6:12 AM, Dongdong Tao wrote:
I see many users recently reporting that they have been struggling with this
Onode::put race condition issue [1] on both the latest Octopus and Pacific.
Igor opened a PR [2] …
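To illustrate the general shape of such a race (a Python analogy only; the
actual Onode::put code is C++ and more subtle):

    import threading

    class Node:
        """Toy object with an unsynchronized reference count."""
        def __init__(self) -> None:
            self.nref = 2  # two owners about to drop their references

    def put(node: Node) -> None:
        # Deliberately non-atomic read-modify-write: both threads can read
        # the same old value, so one decrement is lost.
        old = node.nref
        node.nref = old - 1

    node = Node()
    threads = [threading.Thread(target=put, args=(node,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Expected 0; under an unlucky interleaving this prints 1 and the object
    # would never be released. The real fix is proper synchronization inside
    # Onode::get/put, which is what the PR addresses.
    print(node.nref)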