[ceph-users] Re: Missing OSD in SSD after disk failure
56110ad" stderr: Creating volume group backup "/etc/lvm/backup/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3" (seqno 6).
 stdout: Logical volume "osd-block-42278e28-5274-4167-a014-6a6a956110ad" successfully removed
 stderr: Removing physical volume "/dev/sdc" from volume group "ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3"
 stdout: Volume group "ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3" successfully removed
--> Zapping successful for OSD: 7

After that, the command:

ceph-volume lvm list

shows the same as above for osd.0, but nothing about osd.7.

# refresh devices
ceph orch device ls --refresh

HOST       PATH      TYPE  SIZE   DEVICE_ID  MODEL         VENDOR  ROTATIONAL  AVAIL  REJECT REASONS
nubceph04  /dev/sda  hdd   19.0G             Virtual disk  VMware  1           False  locked
nubceph04  /dev/sdb  hdd   20.0G             Virtual disk  VMware  1           False  locked, Insufficient space (<5GB) on vgs, LVM detected
nubceph04  /dev/sdc  hdd   20.0G             Virtual disk  VMware  1           True
nubceph04  /dev/sdd  hdd   10.0G             Virtual disk  VMware  1           False  locked, LVM detected

After some time, the orchestrator recreates osd.7, but without the db_device:

# monitor ceph for replacement
ceph -W cephadm
...
2021-08-30T18:11:22.439190-0300 mgr.nubvm02.viqmmr [INF] Deploying daemon osd.7 on nubceph04
...

I wait until it finishes rebalancing. If I then run again:

ceph-volume lvm list

it shows, for each OSD, which disks and LVs are in use (snipped):

====== osd.0 =======

  [block]  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdb

  [db]     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      block device  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdd

====== osd.7 =======

  [block]  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      block device  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      osd id        7
      devices       /dev/sdc

It seems it didn't create the LV for "ceph-block-dbs" as it had before.

If I run everything again but with osd.0, it is created correctly, because when running:

ceph-volume lvm zap --osd-id 0 --destroy

it doesn't print this line:

--> More than 1 LV left in VG, will proceed to destroy LV only

but rather this one:

--> Only 1 LV left in VG, will proceed to destroy volume group

As far as I can tell, if the disk where the block.db should go is not empty, it just doesn't use it. Let me know if I wasn't clear enough.
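As a side note, the space LVM still sees in the DB volume group can be checked directly on the host; a sketch, using the ceph-block-dbs VG name from the listing above:

# show the DB volume group and how much free space is left in it
vgs -o vg_name,vg_size,vg_free ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54
# and the DB logical volumes still carved out of it
lvs -o lv_name,lv_size,lv_tags ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54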
Best regards,
Eric

________________________________________
From: David Orman
Sent: Monday, August 30, 2021 1:14 PM
To: Eric Fahnle
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Missing OSD in SSD after disk failure

I may have misread your original email, for which I apologize. If you do a 'ceph orch device ls', does the NVMe in question show as available? On the host with the failed OSD, if you run lvs/lsblk, do you still see the old DB on the NVMe? I'm not sure the replacement process you followed will work.

Here's what we do on OSD pre-failure, as well as on failures, on nodes with NVMe backing the OSDs for DB/WAL. In the cephadm shell, on the host with the drive to replace (in this example, let's say OSD 391 on a node called ceph15):

# capture "db device" and raw device associated with the OSD (just for safety)
ceph-volume lvm list | less

# drain the drive if possible; do this when planning a replacement,
# otherwise do it once the failure has occurred
ceph orch osd rm 391 --replace

# once drained (or if the failure has already occurred)
# (we don't use the orch version yet because we've had issues with it)
ceph-volume lvm zap --osd-id 391 --destroy

# refresh devices
ceph orch device ls --refresh

# monitor ceph for the replacement
ceph -W cephadm

# once the daemon has been deployed ("2021-03-25T18:03:16.742483+
# mgr.ceph02.duoetc [INF] Deploying daemon osd.391 on ceph15"),
# watch for the rebalance to complete
ceph -s

# consider increasing max_backfills if it's just a single drive replacement:
ceph config set osd osd_max_backfills 10

# if you do, after backfilling is complete (validate with 'ceph -s'):
ceph config rm osd osd_max_backfills

The lvm zap cleans up the db/wal LV, which allows the replacement drive to be rebuilt with db/wal on the NVMe.

Hope this helps,
David
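For reference, a quick way to verify the old DB LV is really gone after the zap step above; a sketch, device names are whatever your host has:

# on the OSD host: block devices and any leftover ceph LVs
lsblk
lvs -o lv_name,vg_name,lv_size | grep ceph
# ceph-volume's own view, from inside the cephadm shell
ceph-volume lvm list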
[ceph-users] Re: Missing OSD in SSD after disk failure
I may have misread your original email, for which I apologize. If you do a 'ceph orch device ls', does the NVMe in question show as available? On the host with the failed OSD, if you run lvs/lsblk, do you still see the old DB on the NVMe? I'm not sure the replacement process you followed will work.

Here's what we do on OSD pre-failure, as well as on failures, on nodes with NVMe backing the OSDs for DB/WAL. In the cephadm shell, on the host with the drive to replace (in this example, let's say OSD 391 on a node called ceph15):

# capture "db device" and raw device associated with the OSD (just for safety)
ceph-volume lvm list | less

# drain the drive if possible; do this when planning a replacement,
# otherwise do it once the failure has occurred
ceph orch osd rm 391 --replace

# once drained (or if the failure has already occurred)
# (we don't use the orch version yet because we've had issues with it)
ceph-volume lvm zap --osd-id 391 --destroy

# refresh devices
ceph orch device ls --refresh

# monitor ceph for the replacement
ceph -W cephadm

# once the daemon has been deployed ("2021-03-25T18:03:16.742483+
# mgr.ceph02.duoetc [INF] Deploying daemon osd.391 on ceph15"),
# watch for the rebalance to complete
ceph -s

# consider increasing max_backfills if it's just a single drive replacement:
ceph config set osd osd_max_backfills 10

# if you do, after backfilling is complete (validate with 'ceph -s'):
ceph config rm osd osd_max_backfills

The lvm zap cleans up the db/wal LV, which allows the replacement drive to be rebuilt with db/wal on the NVMe.

Hope this helps,
David

On Fri, Aug 27, 2021 at 7:21 PM Eric Fahnle wrote:
>
> Hi David! Very much appreciated your response.
>
> I'm not sure that may be the problem. I tried the following (without
> using "rotational"):
>
> ...(snip)...
> data_devices:
>   size: "15G:"
> db_devices:
>   size: ":15G"
> filter_logic: AND
> placement:
>   label: "osdj2"
> service_id: test_db_device
> service_type: osd
> ...(snip)...
>
> Without success. I also tried without the "filter_logic: AND" in the
> yaml file and the result was the same.
>
> Best regards,
> Eric
>
>
> -----Original Message-----
> From: David Orman [mailto:orma...@corenode.com]
> Sent: 27 August 2021 14:56
> To: Eric Fahnle
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Missing OSD in SSD after disk failure
>
> This was a bug in some versions of Ceph, which has been fixed:
>
> https://tracker.ceph.com/issues/49014
> https://github.com/ceph/ceph/pull/39083
>
> You'll want to upgrade Ceph to resolve this behavior, or you can use
> size or something else to filter if that is not possible.
>
> David
>
> On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle wrote:
> >
> > Hi everyone!
> > I've got a doubt. I tried searching for it in this list, but didn't
> > find an answer.
> >
> > I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk.
> > The deployment was done with "ceph orch apply deploy-osd.yaml", in
> > which the file "deploy-osd.yaml" contained the following:
> > ---
> > service_type: osd
> > service_id: default_drive_group
> > placement:
> >   label: "osd"
> > data_devices:
> >   rotational: 1
> > db_devices:
> >   rotational: 0
> >
> > After the deployment, each HDD had an OSD, and the NVMe was shared by
> > the 4 OSDs, plus the DB.
> >
> > A few days ago, an HDD broke and got replaced. Ceph detected the new
> > disk and created a new OSD for the HDD, but didn't use the NVMe. Now
> > the NVMe in that server has 3 OSDs running, but the new one wasn't
> > added. I couldn't find out how to re-create the OSD with the exact
> > configuration it had before.
> > The only "way" I found was to delete all 4 OSDs and create everything
> > from scratch (I didn't actually do it, as I hope there is a better way).
> >
> > Has anyone had this issue before? I'd be glad if someone pointed me in
> > the right direction.
> >
> > Currently running:
> > Version 15.2.8 octopus (stable)
> >
> > Thank you in advance and best regards,
> > Eric
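A possible way to sanity-check a drive-group spec like the one quoted above before applying it, assuming a recent Octopus release where the flag is available:

# preview which OSDs the orchestrator would create from the spec, without deploying anything
ceph orch apply -i deploy-osd.yaml --dry-run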
[ceph-users] Re: Missing OSD in SSD after disk failure
Hi David! Very much appreciated your response.

I'm not sure that may be the problem. I tried the following (without using "rotational"):

...(snip)...
data_devices:
  size: "15G:"
db_devices:
  size: ":15G"
filter_logic: AND
placement:
  label: "osdj2"
service_id: test_db_device
service_type: osd
...(snip)...

Without success. I also tried without the "filter_logic: AND" in the yaml file and the result was the same.

Best regards,
Eric


-----Original Message-----
From: David Orman [mailto:orma...@corenode.com]
Sent: 27 August 2021 14:56
To: Eric Fahnle
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Missing OSD in SSD after disk failure

This was a bug in some versions of Ceph, which has been fixed:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

You'll want to upgrade Ceph to resolve this behavior, or you can use size or something else to filter if that is not possible.

David

On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle wrote:
>
> Hi everyone!
> I've got a doubt. I tried searching for it in this list, but didn't find
> an answer.
>
> I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk. The
> deployment was done with "ceph orch apply deploy-osd.yaml", in which the
> file "deploy-osd.yaml" contained the following:
> ---
> service_type: osd
> service_id: default_drive_group
> placement:
>   label: "osd"
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
>
> After the deployment, each HDD had an OSD, and the NVMe was shared by the
> 4 OSDs, plus the DB.
>
> A few days ago, an HDD broke and got replaced. Ceph detected the new disk
> and created a new OSD for the HDD, but didn't use the NVMe. Now the NVMe
> in that server has 3 OSDs running, but the new one wasn't added. I
> couldn't find out how to re-create the OSD with the exact configuration
> it had before. The only "way" I found was to delete all 4 OSDs and create
> everything from scratch (I didn't actually do it, as I hope there is a
> better way).
>
> Has anyone had this issue before? I'd be glad if someone pointed me in
> the right direction.
>
> Currently running:
> Version 15.2.8 octopus (stable)
>
> Thank you in advance and best regards,
> Eric
[ceph-users] Re: Missing OSD in SSD after disk failure
This was a bug in some versions of Ceph, which has been fixed:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

You'll want to upgrade Ceph to resolve this behavior, or you can use size or something else to filter if that is not possible.

David

On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle wrote:
>
> Hi everyone!
> I've got a doubt. I tried searching for it in this list, but didn't find
> an answer.
>
> I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk. The
> deployment was done with "ceph orch apply deploy-osd.yaml", in which the
> file "deploy-osd.yaml" contained the following:
> ---
> service_type: osd
> service_id: default_drive_group
> placement:
>   label: "osd"
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
>
> After the deployment, each HDD had an OSD, and the NVMe was shared by the
> 4 OSDs, plus the DB.
>
> A few days ago, an HDD broke and got replaced. Ceph detected the new disk
> and created a new OSD for the HDD, but didn't use the NVMe. Now the NVMe
> in that server has 3 OSDs running, but the new one wasn't added. I
> couldn't find out how to re-create the OSD with the exact configuration
> it had before. The only "way" I found was to delete all 4 OSDs and create
> everything from scratch (I didn't actually do it, as I hope there is a
> better way).
>
> Has anyone had this issue before? I'd be glad if someone pointed me in
> the right direction.
>
> Currently running:
> Version 15.2.8 octopus (stable)
>
> Thank you in advance and best regards,
> Eric
[ceph-users] Re: Missing OSD in SSD after disk failure
Targets:   block.db          Total size: 9.00 GB
/usr/bin/podman:stdout Total LVs: 2               Size per LV: 4.50 GB
/usr/bin/podman:stdout Devices: /dev/sdd
/usr/bin/podman:stdout
/usr/bin/podman:stdout   Type            Path                 LV Size        % of device
/usr/bin/podman:stdout
/usr/bin/podman:stdout   [data]          /dev/sdb             19.00 GB       100.0%
/usr/bin/podman:stdout   [block.db]      vg: vg/lv            4.50 GB        50%
/usr/bin/podman:stdout
/usr/bin/podman:stdout   [data]          /dev/sdc             19.00 GB       100.0%
/usr/bin/podman:stdout   [block.db]      vg: vg/lv            4.50 GB        50%

Total OSDs: 2

Solid State VG:
  Targets:   block.db          Total size: 9.00 GB
  Total LVs: 2                 Size per LV: 4.50 GB
  Devices:   /dev/sdd

  Type            Path                 LV Size        % of device

  [data]          /dev/sdb             19.00 GB       100.0%
  [block.db]      vg: vg/lv            4.50 GB        50%

  [data]          /dev/sdc             19.00 GB       100.0%
  [block.db]      vg: vg/lv            4.50 GB        50%

My conclusion (I may be wrong) is that if the disk where the block.db will be placed is not 100% empty, then it does not use it.

My question, then, is: is there a way to recreate/create a new OSD that has a block.db on a disk which is not completely empty?

Answering your questions one by one:

1. "Can you check what ceph-volume would do if you did it manually?"
All answered above.

2. "One more question, did you properly wipe the previous LV on that NVMe?"
If I'm not mistaken, I did exactly that when I ran the command:
ceph orch device zap nubceph04 /dev/ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-block-db-b1e2a81f-2fc9-4786-85d2-6a27430d9f2e --force

3. "You should also have some logs available from the deployment attempt, maybe it reveals why the NVMe was not considered."
I couldn't find any relevant logs regarding this question.

Best regards,
Eric

-----Original Message-----
From: Eugen Block [mailto:ebl...@nde.ag]
Sent: 24 August 2021 05:07
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Missing OSD in SSD after disk failure

Can you check what ceph-volume would do if you did it manually? Something like this:

host1:~ # cephadm ceph-volume lvm batch --report /dev/vdc /dev/vdd --db-devices /dev/vdb

and don't forget the '--report' flag.

One more question: did you properly wipe the previous LV on that NVMe?

You should also have some logs available from the deployment attempt; maybe they reveal why the NVMe was not considered.

Zitat von Eric Fahnle :

> Hi Eugen, thanks for the reply.
>
> I've already tried what you wrote in your answer, but still no luck.
>
> The NVMe disk still doesn't have the OSD. Please note I'm using
> containers, not standalone OSDs.
>
> Any ideas?
>
> Regards,
> Eric
>
>
> Message: 2
> Date: Fri, 20 Aug 2021 06:56:59 +
> From: Eugen Block
> Subject: [ceph-users] Re: Missing OSD in SSD after disk failure
> To: ceph-users@ceph.io
> Message-ID:
> <20210820065659.horde.azw9ev10u5ynqkwjpuyr...@webmail.nde.ag>
> Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
>
> Hi,
>
> this seems to be a recurring issue; I had the same just yesterday in
> my lab environment running on 15.2.13. If I don't specify other
> criteria in the yaml file, I end up with standalone OSDs instead of
> the desired RocksDB on SSD. Maybe this is still a bug, I didn't check.
> My workaround is this spec file:
>
> ---snip---
> block_db_size: 4G
> data_devices:
>   size: "20G:"
>   rotational: 1
> db_devices:
>   size: "10G"
>   rotational: 0
> filter_logic: AND
> placement:
>   hosts:
>   - host4
>   - host3
>   - host1
>   - host2
> service_id: default
> service_type: osd
> ---snip---
>
> If you apply the new spec file, then destroy and zap the standalone
> OSD, I believe the orchestrator should redeploy it correctly; it did
> in my case. But as I said, this is just a small lab environment.
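On the open question of placing a block.db on a disk that is not completely empty: one possible manual workaround is to carve the DB LV yourself and hand both pieces to ceph-volume, bypassing the batch planner entirely. This is only a sketch, untested here; the LV name "osd-db-new" is made up, the size is taken from the report above, and with cephadm the resulting OSD would still need to be adopted by the orchestrator afterwards:

# create a DB LV in the existing ceph-block-dbs VG (LV name is hypothetical)
lvcreate -L 4.5G -n osd-db-new ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17
# prepare the OSD with an explicit data device and db LV (inside the cephadm shell)
ceph-volume lvm prepare --bluestore --data /dev/sdc \
    --block.db ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-db-new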
[ceph-users] Re: Missing OSD in SSD after disk failure
Can you check what ceph-volume would do if you did it manually? Something like this:

host1:~ # cephadm ceph-volume lvm batch --report /dev/vdc /dev/vdd --db-devices /dev/vdb

and don't forget the '--report' flag.

One more question: did you properly wipe the previous LV on that NVMe?

You should also have some logs available from the deployment attempt; maybe they reveal why the NVMe was not considered.

Zitat von Eric Fahnle :

Hi Eugen, thanks for the reply.

I've already tried what you wrote in your answer, but still no luck.

The NVMe disk still doesn't have the OSD. Please note I'm using containers, not standalone OSDs.

Any ideas?

Regards,
Eric

Message: 2
Date: Fri, 20 Aug 2021 06:56:59 +
From: Eugen Block
Subject: [ceph-users] Re: Missing OSD in SSD after disk failure
To: ceph-users@ceph.io
Message-ID: <20210820065659.horde.azw9ev10u5ynqkwjpuyr...@webmail.nde.ag>
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes

Hi,

this seems to be a recurring issue; I had the same just yesterday in my lab environment running on 15.2.13. If I don't specify other criteria in the yaml file, I end up with standalone OSDs instead of the desired RocksDB on SSD. Maybe this is still a bug, I didn't check. My workaround is this spec file:

---snip---
block_db_size: 4G
data_devices:
  size: "20G:"
  rotational: 1
db_devices:
  size: "10G"
  rotational: 0
filter_logic: AND
placement:
  hosts:
  - host4
  - host3
  - host1
  - host2
service_id: default
service_type: osd
---snip---

If you apply the new spec file, then destroy and zap the standalone OSD, I believe the orchestrator should redeploy it correctly; it did in my case. But as I said, this is just a small lab environment.

Regards,
Eugen
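If the batch report looks right but a device still isn't picked up, ceph-volume's inventory can show how the host's disks are classified, including rejection reasons. A sketch, reusing the illustrative device name from above:

# list all disks with availability and reject reasons
cephadm ceph-volume inventory
# or inspect a single device in detail
cephadm ceph-volume inventory /dev/vdb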
[ceph-users] Re: Missing OSD in SSD after disk failure
Hi Eugen, thanks for the reply.

I've already tried what you wrote in your answer, but still no luck.

The NVMe disk still doesn't have the OSD. Please note I'm using containers, not standalone OSDs.

Any ideas?

Regards,
Eric


Message: 2
Date: Fri, 20 Aug 2021 06:56:59 +
From: Eugen Block
Subject: [ceph-users] Re: Missing OSD in SSD after disk failure
To: ceph-users@ceph.io
Message-ID: <20210820065659.horde.azw9ev10u5ynqkwjpuyr...@webmail.nde.ag>
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes

Hi,

this seems to be a recurring issue; I had the same just yesterday in my lab environment running on 15.2.13. If I don't specify other criteria in the yaml file, I end up with standalone OSDs instead of the desired RocksDB on SSD. Maybe this is still a bug, I didn't check. My workaround is this spec file:

---snip---
block_db_size: 4G
data_devices:
  size: "20G:"
  rotational: 1
db_devices:
  size: "10G"
  rotational: 0
filter_logic: AND
placement:
  hosts:
  - host4
  - host3
  - host1
  - host2
service_id: default
service_type: osd
---snip---

If you apply the new spec file, then destroy and zap the standalone OSD, I believe the orchestrator should redeploy it correctly; it did in my case. But as I said, this is just a small lab environment.

Regards,
Eugen
[ceph-users] Re: Missing OSD in SSD after disk failure
Hi,

this seems to be a recurring issue; I had the same just yesterday in my lab environment running on 15.2.13. If I don't specify other criteria in the yaml file, I end up with standalone OSDs instead of the desired RocksDB on SSD. Maybe this is still a bug, I didn't check. My workaround is this spec file:

---snip---
block_db_size: 4G
data_devices:
  size: "20G:"
  rotational: 1
db_devices:
  size: "10G"
  rotational: 0
filter_logic: AND
placement:
  hosts:
  - host4
  - host3
  - host1
  - host2
service_id: default
service_type: osd
---snip---

If you apply the new spec file, then destroy and zap the standalone OSD, I believe the orchestrator should redeploy it correctly; it did in my case. But as I said, this is just a small lab environment.

Regards,
Eugen


Zitat von Eric Fahnle :

Hi everyone!
I've got a doubt. I tried searching for it in this list, but didn't find an answer.

I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk. The deployment was done with "ceph orch apply deploy-osd.yaml", in which the file "deploy-osd.yaml" contained the following:
---
service_type: osd
service_id: default_drive_group
placement:
  label: "osd"
data_devices:
  rotational: 1
db_devices:
  rotational: 0

After the deployment, each HDD had an OSD, and the NVMe was shared by the 4 OSDs, plus the DB.

A few days ago, an HDD broke and got replaced. Ceph detected the new disk and created a new OSD for the HDD, but didn't use the NVMe. Now the NVMe in that server has 3 OSDs running, but the new one wasn't added. I couldn't find out how to re-create the OSD with the exact configuration it had before. The only "way" I found was to delete all 4 OSDs and create everything from scratch (I didn't actually do it, as I hope there is a better way).

Has anyone had this issue before? I'd be glad if someone pointed me in the right direction.

Currently running:
Version 15.2.8 octopus (stable)

Thank you in advance and best regards,
Eric
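Condensed, the redeploy flow Eugen describes would look roughly like this; the spec file name, OSD id, host, and device are all illustrative:

# apply the corrected spec
ceph orch apply -i osd-spec.yaml
# remove the standalone OSD, keeping its id for the replacement
ceph orch osd rm 7 --replace
# once it has drained, zap the data disk so the orchestrator sees it as available again
ceph orch device zap host1 /dev/vdc --force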