On 3/22/21 3:52 PM, Nico Schottelius wrote:

Hello,

following up on my mail from 2020 [0]: it seems that OSDs sometimes have
"multiple classes" assigned:

[15:47:15] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush rm-device-class osd.4
done removing class of osd(s): 4
[15:47:17] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush rm-device-class osd.4
osd.4 belongs to no class,
[15:47:20] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush set-device-class xruk osd.4
set osd(s) 4 to class 'xruk'
[15:47:45] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush set-device-class xruk osd.4
osd.4 already set to class xruk. set-device-class item id 4 name 'osd.4' device_class 'xruk': no change.
[15:47:47] server6.place6:/var/lib/ceph/osd/ceph-4# /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph

2021-03-22 15:48:02.773 7fe2f81e4d80 -1 osd.4 94608 log_to_monitors {default=true}
2021-03-22 15:48:02.777 7fe2f81e4d80 -1 osd.4 94608 mon_cmd_maybe_osd_create fail: 'osd.4 has already bound to class 'xruk', can not reset class to 'hdd'; use 'ceph osd crush rm-device-class <id>' to remove old class first': (16) Device or resource busy
[15:48:02] server6.place6:/var/lib/ceph/osd/ceph-4#
[15:48:02] server6.place6:/var/lib/ceph/osd/ceph-4#

We are running ceph 14.2.9.

As written before, it also seems that the affected OSD is peering with
OSDs from the wrong class (hdd). Does anyone have a hint on how to fix
this?

Do you have osd_class_update_on_start enabled?

On our cluster, NVMe OSDs would wrongly try to add themselves to the "SSD" class (which didn't succeed). But maybe your OSDs sometimes do manage to put themselves in a wrong class? Just guessing, but I would turn that off. The same goes for this parameter:

osd_crush_update_on_start
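
For illustration, a minimal sketch of turning both options off. It assumes they go in the [osd] section of /etc/ceph/ceph.conf (the config path from the ceph-osd command line above) and that the affected OSDs are restarted afterwards. The placement in [osd] is an assumption about this setup, not something confirmed in this thread:

```
# /etc/ceph/ceph.conf (assumed location, taken from the "-c" argument above)
[osd]
# Do not let an OSD set its own device class when it starts
osd_class_update_on_start = false
# Do not let an OSD update its own CRUSH location when it starts
osd_crush_update_on_start = false
```

Keep in mind that with osd_crush_update_on_start disabled, OSDs no longer place themselves in the CRUSH map at start, so newly created OSDs have to be positioned by hand.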

Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
