Marco,
this validation was introduced in v18.2.5, because not following the rule
could result in an OSD crash in some cases.
So it's better to catch that sooner rather than later.
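As a quick sanity check, something like the following should show whether an OSD's ratios add up (a rough sketch only, assuming the standard ceph config CLI and using osd.3 from your log as the target; the awk summing is just illustrative, not output from your cluster):

  for p in bluestore_cache_meta_ratio bluestore_cache_kv_ratio bluestore_cache_kv_onode_ratio; do
      ceph config get osd.3 "$p"      # prints the effective value for that option
  done | awk '{ s += $1 } END { printf "sum=%.2f (must be <= 1.0)\n", s }'

If the printed sum is above 1.0, the OSD will refuse to start with the error you are seeing.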
Thanks,
Igor
On 29.04.2025 14:27, Marco Pizzolo wrote:
Hi Igor,
Thank you so very much for responding so quickly. Interestingly, I
don't remember setting these values, but I did see a global-level
override of 0.8 on one, and 0.2 on another, so I removed the global
overrides and am rebooting the server to see what happens.
I should know soon enough how things are looking.
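For reference, removing a global-level override looks roughly like this (a sketch only; the thread does not show exactly which cache-ratio options were overridden here, so the option name below is illustrative):

  # list any cache-ratio overrides stored in the mon config database
  ceph config dump | grep bluestore_cache

  # drop a global-level override, e.g. for the meta ratio
  ceph config rm global bluestore_cache_meta_ratio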
I'll report back, but I don't understand why I would have been able to
upgrade this over the past 4-5 years from 14 --> 15 --> 16 --> 17 -->
18.2.4 without issues, but now going from 18.2.4 --> 18.2.6 I am dead
in the water.
Thanks,
Marco
On Tue, Apr 29, 2025 at 1:18 PM Igor Fedotov <igor.fedo...@croit.io>
wrote:
Hi Marco,
the following log line (unfortunately it was cut off) sheds some
light:
"
Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug
2025-04-29T10:24:09.287+0000 7f6961ae9740 -1
bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes
bluestore_cache_meta_>
"
It most likely says that the sum of the bluestore_cache_meta_ratio +
bluestore_cache_kv_ratio + bluestore_cache_kv_onode_ratio config
parameters exceeds 1.0.
So one has to tune the parameters so that their sum is less than or
equal to 1.0.
Default settings are:
bluestore_cache_meta_ratio = 0.45
bluestore_cache_kv_ratio = 0.45
bluestore_cache_kv_onode_ratio = 0.04
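Assuming the custom values live in the mon config database rather than in ceph.conf, resetting them to the defaults above would look roughly like this (a sketch, not taken from this cluster):

  ceph config set osd bluestore_cache_meta_ratio 0.45
  ceph config set osd bluestore_cache_kv_ratio 0.45
  ceph config set osd bluestore_cache_kv_onode_ratio 0.04

Alternatively, simply remove the custom values with "ceph config rm ..." so that the defaults apply again.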
Thanks,
Igor
On 29.04.2025 13:36, Marco Pizzolo wrote:
> Hello Everyone,
>
> I'm upgrading from 18.2.4 to 18.2.6, and I have a 4-node cluster with 8
> NVMes per node. Each NVMe is split into 2 OSDs. The upgrade went through
> the mgr, mon, and crash daemons and began upgrading OSDs.
>
> The OSDs it was upgrading were not coming back online.
>
> I tried rebooting, and no luck.
>
> journalctl -xe shows the following:
>
> ░░ The unit docker-02cb79ef9a657cdaa26b781966aa6d2f1d5e54cdc9efa6c5ff1f0e98c3a866e4.scope has successfully entered the 'dead' state.
> Apr 29 06:24:09 prdhcistonode01 dockerd[2967]: time="2025-04-29T06:24:09.282073583-04:00" level=info msg="ignoring event" container=76c56ddd668015de0022bfa2527060e64a9513>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282129114-04:00" level=info msg="shim disconnected" id=76c56ddd668015de0022bfa2527060e64a95137>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282219664-04:00" level=warning msg="cleaning up after shim disconnected" id=76c56ddd668015de00>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282242484-04:00" level=info msg="cleaning up dead shim"
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 0 osd.3:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) open path /var/lib/cep>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292047607-04:00" level=warning msg="cleanup warnings time=\"2025-04-29T06:24:09-04:00\" level=>
> Apr 29 06:24:09 prdhcistonode01 dockerd[2967]: time="2025-04-29T06:24:09.292163618-04:00" level=info msg="ignoring event" container=02cb79ef9a657cdaa26b781966aa6d2f1d5e54>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292216428-04:00" level=info msg="shim disconnected" id=02cb79ef9a657cdaa26b781966aa6d2f1d5e54c>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292277279-04:00" level=warning msg="cleaning up after shim disconnected" id=02cb79ef9a657cdaa2>
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292291949-04:00" level=info msg="cleaning up dead shim"
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) open size 640122932428>
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 -1 bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes bluestore_cache_meta_>
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) close
> Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.303385220-04:00" level=warning msg="cleanup warnings time=\"2025-04-29T06:24:09-04:00\" level=>
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 0 osd.0:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> Apr 29 06:24:09 prdhcistonode01 bash[23144]: debug 2025-04-29T10:24:09.307+0000 7f12f08c5740 -1 osd.15 0 OSD:init: unable to mount object store
> Apr 29 06:24:09 prdhcistonode01 bash[23144]: debug 2025-04-29T10:24:09.307+0000 7f12f08c5740 -1 ** ERROR: osd init failed: (22) Invalid argument
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/cep>
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) open size 640122932428>
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 -1 bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes bluestore_cache_meta_>
> Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) close
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 0 osd.8:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/cep>
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) open size 640122932428>
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 -1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes bluestore_cache_meta_>
> Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) close
> Apr 29 06:24:09 prdhcistonode01 systemd[1]: ceph-fbc38f5c-a3a6-11ea-805c-3b954db9ce7a@osd.12.service: Main process exited, code=exited, status=1/FAILURE
>
>
> Any help you can offer would be greatly appreciated. This is running in
> Docker:
>
> Client: Docker Engine - Community
> Version: 24.0.7
> API version: 1.43
> Go version: go1.20.10
> Git commit: afdd53b
> Built: Thu Oct 26 09:08:01 2023
> OS/Arch: linux/amd64
> Context: default
>
> Server: Docker Engine - Community
> Engine:
> Version: 24.0.7
> API version: 1.43 (minimum version 1.12)
> Go version: go1.20.10
> Git commit: 311b9ff
> Built: Thu Oct 26 09:08:01 2023
> OS/Arch: linux/amd64
> Experimental: false
> containerd:
> Version: 1.6.25
> GitCommit: d8f198a4ed8892c764191ef7b3b06d8a2eeb5c7f
> runc:
> Version: 1.1.10
> GitCommit: v1.1.10-0-g18a0cb0
> docker-init:
> Version: 0.19.0
> GitCommit: de40ad0
>
> Thanks,
> Marco
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io