[ceph-users] Re: Replace HDD with cephadm

2022-03-16 Thread Kai Stian Olstad

On 15.03.2022 10:10, Jimmy Spets wrote:

Thanks for your reply.
I have two things that I am unsure of:
- Is the OSD UUID the same for all OSDs or should it be unique for 
each?


It's unique and generated when you run ceph-volume lvm prepare or add an 
OSD.


You can find the OSD UUID/FSID for an existing OSD in /var/lib/ceph/<cluster FSID>/osd.<id>/fsid
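
For example (the path here is a placeholder; substitute your cluster 
FSID and the OSD id):

  root@osd-host:~# cat /var/lib/ceph/<cluster FSID>/osd.<id>/fsid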



- Have I understood correctly that in your example the OSD is not 
encrypted?


Yes, it's not encrypted.

--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Replace HDD with cephadm

2022-03-11 Thread Kai Stian Olstad

On 10.03.2022 14:48, Jimmy Spets wrote:

I have a Ceph Pacific cluster managed by cephadm.

The nodes have six HDDs and one NVMe that is shared between the six
HDDs.

The OSD spec file looks like this:

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
  size: '800G:1200G'
db_slots: 6
encrypted: true

I need to replace one of the HDDs, which is broken.

How do I replace the HDD so that the new OSD is connected to the old
HDD's db slot?


Last time I tried, cephadm could not replace a disk where the db was on 
a separate drive.

It would just add it as a new OSD without the db on a separate disk.
To avoid this, remove all active OSD specs so the disk won't be 
added automatically by cephadm, as sketched below.
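
One way to do this is to set the spec to unmanaged and re-apply it, so 
the existing OSDs stay but cephadm stops consuming new devices. A 
sketch, assuming the spec from your mail is kept in a file called 
osd_spec_default.yml:

  root@admin:~# cat osd_spec_default.yml
  service_type: osd
  service_id: osd_spec_default
  placement:
    host_pattern: '*'
  unmanaged: true    # added: stop cephadm from creating OSDs from this spec
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
    size: '800G:1200G'
  db_slots: 6
  encrypted: true
  root@admin:~# ceph orch apply -i osd_spec_default.yml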

Then you need to add the disk manually.
This is unfortunately not described anywhere, but the procedure I follow 
is the one below; in this example the OSD is osd.152.



Find the VG and LV of the block db for the OSD.
  root@osd-host:~# ls -l /var/lib/ceph/*/osd.152/block.db
  lrwxrwxrwx 1 167 167 90 Dec  1 12:58 
/var/lib/ceph/b321e76e-da3a-11eb-b75c-4f948441dcd0/osd.152/block.db -> 
/dev/ceph-10215920-77ea-4d50-b153-162477116b4c/osd-db-25762869-20d5-49b1-9ff4-378af8f679c4


  VG = ceph-10215920-77ea-4d50-b153-162477116b4c
  LV = osd-db-25762869-20d5-49b1-9ff4-378af8f679c4

If you have already removed it, you'll find it in 
/var/lib/ceph/*/removed/
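
Another way to see which db LV belongs to which OSD (an alternative, 
not part of the original procedure) is ceph-volume from inside the 
cephadm shell; each OSD is listed with a [block] and a [db] device:

  root@osd-host:~# cephadm shell
  root@osd-host:/# ceph-volume lvm list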



Then you remove the OSD.
  root@admin:~# ceph orch osd rm 152 --replace
  Scheduled OSD(s) for removal
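
If you want to check the progress before pulling the disk, the removal 
queue can be watched with:

  root@admin:~# ceph orch osd rm status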

When the disk has been removed from Ceph, you can replace it with a new one.
Check dmesg to see what the new disk is named; in my case it's /dev/sdt


Prepare the new disk

  root@osd-host:~# cephadm shell

  root@osd-host:/# ceph auth get client.bootstrap-osd 
>/var/lib/ceph/bootstrap-osd/ceph.keyring

  exported keyring for client.bootstrap-osd

  # Here you need to use the VG/LV you found above so you can reuse the 
db volume.
  root@osd-host:~# ceph-volume lvm prepare --bluestore --no-systemd 
--osd-id 152 --data /dev/sdt --block.db 
ceph-10215920-77ea-4d50-b153-162477116b4c/osd-db-25762869-20d5-49b1-9ff4-378af8f679c4

  < removed some output >
  Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore 
bluestore --mkfs -i 152 --monmap 
/var/lib/ceph/osd/ceph-152/activate.monmap --keyfile - 
--bluestore-block-db-path 
/dev/ceph-10215920-77ea-4d50-b153-162477116b4c/osd-db-25762869-20d5-49b1-9ff4-378af8f679c4 
--osd-data /var/lib/ceph/osd/ceph-152/ --osd-uuid 
517213f3-0715-4d23-8103-6a34b1f8ef08 --setuser ceph --setgroup ceph
 stderr: 2021-12-01T11:50:33.613+ 7ff013614080 -1 
bluestore(/var/lib/ceph/osd/ceph-152/) _read_fsid unparsable uuid

  --> ceph-volume lvm prepare successful for: /dev/sdt

Here you need the --osd-uuid, which is 
517213f3-0715-4d23-8103-6a34b1f8ef08
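
If you did not note it down, the UUID should also be visible at the end 
of the OSD's line in the osdmap, for example:

  root@admin:~# ceph osd dump | grep '^osd.152 '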



Then you need a JSON file containing the ceph config and the OSD's 
authentication key; this file can be created like this
  root@admin:~# printf '{\n"config": "%s",\n"keyring": "%s"\n}\n' 
"$(ceph config generate-minimal-conf | sed -e ':a;N;$!ba;s/\n/\\n/g' -e 
's/\t/\\t/g' -e 's/$/\\n/')" "$(ceph auth get osd.152 | head -n 2 | sed 
-e ':a;N;$!ba;s/\n/\\n/g' -e 's/\t/\\t/g' -e 's/$/\\n/')" 
>config-osd.152.json
You might need to copy the json file to the OSD-host depending on where 
you run the command.
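
For reference, the file ends up looking roughly like this (shortened 
here; the real file contains the full output of ceph config 
generate-minimal-conf and the key for osd.152):

  {
  "config": "# minimal ceph.conf for b321e76e-da3a-11eb-b75c-4f948441dcd0\n[global]\n\tfsid = b321e76e-da3a-11eb-b75c-4f948441dcd0\n\tmon_host = ...\n",
  "keyring": "[osd.152]\n\tkey = ...\n"
  }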


The --osd-uuid above is the same as --osd-fsid in this command (thank 
you for the consistent naming).
  root@osd-host:~# cephadm deploy --fsid <cluster fsid> --name osd.152 
--config-json config-osd.152.json --osd-fsid 
517213f3-0715-4d23-8103-6a34b1f8ef08


And then the OSD should be back up and running.
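
You can check that it came back with, for example:

  root@admin:~# ceph orch ps | grep osd.152
  root@admin:~# ceph osd tree | grep osd.152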

This is the way I have found to do OSD replacement; there might be an 
easier way of doing it, but I have not found one.



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io