[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD
yes

On 11.03.21 at 15:46, Kai Stian Olstad wrote:
> Hi Sebastian
>
> On 11.03.2021 13:13, Sebastian Wagner wrote:
>> looks like
>>
>> $ ssh pech-hd-009
>> # cephadm ls
>>
>> is returning this non-existent OSD.
>>
>> Can you verify that `cephadm ls` on that host doesn't
>> print osd.355?
>
> "cephadm ls" on the node does list this drive:
>
> {
>     "style": "cephadm:v1",
>     "name": "osd.355",
>     "fsid": "3614abcc-201c-11eb-995a-2794bcc75ae0",
>     "systemd_unit": "ceph-3614abcc-201c-11eb-995a-2794bcc75ae0@osd.355",
>     "enabled": true,
>     "state": "stopped",
>     "container_id": null,
>     "container_image_name": "goharbor.example.com/library/ceph/ceph:v15.2.5",
>     "container_image_id": null,
>     "version": null,
>     "started": null,
>     "created": "2021-01-20T09:53:22.229080",
>     "deployed": "2021-02-09T09:24:02.855576",
>     "configured": "2021-02-09T09:24:04.211587"
> }
>
> To resolve it, could I just remove it with "cephadm rm-daemon"?

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg). Managing Director: Felix Imendörffer

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD
Hi Kai,

looks like

$ ssh pech-hd-009
# cephadm ls

is returning this non-existent OSD.

Can you verify that `cephadm ls` on that host doesn't print osd.355?

Best,
Sebastian

On 11.03.21 at 12:16, Kai Stian Olstad wrote:
> Before I started the upgrade the cluster was healthy, but one
> OSD (osd.355) was down; I can't remember whether it was in or out.
> The upgrade was started with
>   ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
>
> The upgrade started, but when Ceph tried to upgrade osd.355 it paused
> with the following messages:
>
> 2021-03-11T09:15:35.638104+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Target is goharbor.example.com/library/ceph/ceph:v15.2.9 with id dfc48307963697ff48acd9dd6fda4a7a24017b9d8124f86c2a542b0802fe77ba
> 2021-03-11T09:15:35.639882+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mgr daemons...
> 2021-03-11T09:15:35.644170+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mgr daemons are up to date.
> 2021-03-11T09:15:35.644376+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mon daemons...
> 2021-03-11T09:15:35.647669+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mon daemons are up to date.
> 2021-03-11T09:15:35.647866+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking crash daemons...
> 2021-03-11T09:15:35.652035+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Setting container_image for all crash...
> 2021-03-11T09:15:35.653683+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: All crash daemons are up to date.
> 2021-03-11T09:15:35.653896+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking osd daemons...
> 2021-03-11T09:15:36.273345+ mgr.pech-mon-2.cjeiyc [INF] It is presumed safe to stop ['osd.355']
> 2021-03-11T09:15:36.273504+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: It is presumed safe to stop ['osd.355']
> 2021-03-11T09:15:36.273887+ mgr.pech-mon-2.cjeiyc [INF] Upgrade: Redeploying osd.355
> 2021-03-11T09:15:36.276673+ mgr.pech-mon-2.cjeiyc [ERR] Upgrade: Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 on host pech-hd-009 failed.
>
> One of the first things the upgrade did was to upgrade the mons, so they
> were restarted, and now osd.355 no longer exists:
>
> $ ceph osd info osd.355
> Error EINVAL: osd.355 does not exist
>
> But if I run a resume
>   ceph orch upgrade resume
> it still tries to upgrade osd.355, with the same message as above.
>
> I tried to stop and start the upgrade again with
>   ceph orch upgrade stop
>   ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
> but it still tries to upgrade osd.355, with the same message as above.
>
> Looking at the source code, it looks like it gets the daemons to upgrade
> from the mgr cache, so I restarted both mgrs, but it still tries to
> upgrade osd.355.
>
> Does anyone know how I can get the upgrade to continue?
>
> --
> Kai Stian Olstad
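As a small illustration of reading the log above: the `[ERR]` line names both the daemon and the host, which is what points at `pech-hd-009` as the place to look. The sketch below pulls those fields out with a regular expression. The pattern is only fitted to the one message quoted in this thread, and the function name `parse_pause` is made up for the example; other cephadm versions may word the message differently.

```python
import re

# Pattern fitted to the single [ERR] line quoted in this thread (an assumption,
# not a documented log format).
PAUSE_RE = re.compile(
    r"Upgrade: Paused due to (?P<code>\w+): "
    r"Upgrading daemon (?P<daemon>\S+) on host (?P<host>\S+) failed"
)


def parse_pause(line):
    """Return the health code, daemon and host from a pause message, or None."""
    match = PAUSE_RE.search(line)
    return match.groupdict() if match else None


log_line = (
    "2021-03-11T09:15:36.276673+ mgr.pech-mon-2.cjeiyc [ERR] Upgrade: "
    "Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 "
    "on host pech-hd-009 failed."
)
print(parse_pause(log_line))
```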
[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD
On 11.03.2021 15:47, Sebastian Wagner wrote:
> yes
>
> On 11.03.21 at 15:46, Kai Stian Olstad wrote:
>> To resolve it, could I just remove it with "cephadm rm-daemon"?

That worked like a charm, and the upgrade has resumed.

Thank you Sebastian.

--
Kai Stian Olstad
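The thread does not show the exact `cephadm rm-daemon` invocation that was used. As a hedged sketch, the helper below only assembles a plausible command line from the `name` and `fsid` fields of the `cephadm ls` entry; it does not run anything. The flags `--name`, `--fsid` and `--force` are how I recall cephadm's command-line help (with `--force` being required for OSD and mon daemons), so please confirm them with `cephadm rm-daemon --help` on your version. The function name `rm_daemon_command` is invented for this example.

```python
import shlex


def rm_daemon_command(name, fsid, force=True):
    """Build (but do not execute) a cephadm rm-daemon command line.

    Flag names are an assumption to be checked against `cephadm rm-daemon --help`.
    """
    cmd = ["cephadm", "rm-daemon", "--name", name, "--fsid", fsid]
    if force:
        cmd.append("--force")
    return cmd


# Values taken from the `cephadm ls` entry quoted in this thread.
cmd = rm_daemon_command("osd.355", "3614abcc-201c-11eb-995a-2794bcc75ae0")
print(shlex.join(cmd))
```

The printed command would be run as root on the host that still lists the daemon (here `pech-hd-009`), followed by `ceph orch upgrade resume` from a node with the admin keyring.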
[ceph-users] Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD
Hi Sebastian

On 11.03.2021 13:13, Sebastian Wagner wrote:
> looks like
>
> $ ssh pech-hd-009
> # cephadm ls
>
> is returning this non-existent OSD.
>
> Can you verify that `cephadm ls` on that host doesn't
> print osd.355?

"cephadm ls" on the node does list this drive:

{
    "style": "cephadm:v1",
    "name": "osd.355",
    "fsid": "3614abcc-201c-11eb-995a-2794bcc75ae0",
    "systemd_unit": "ceph-3614abcc-201c-11eb-995a-2794bcc75ae0@osd.355",
    "enabled": true,
    "state": "stopped",
    "container_id": null,
    "container_image_name": "goharbor.example.com/library/ceph/ceph:v15.2.5",
    "container_image_id": null,
    "version": null,
    "started": null,
    "created": "2021-01-20T09:53:22.229080",
    "deployed": "2021-02-09T09:24:02.855576",
    "configured": "2021-02-09T09:24:04.211587"
}

To resolve it, could I just remove it with "cephadm rm-daemon"?

--
Kai Stian Olstad
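For anyone hitting the same situation on a host with many OSDs, the check above (does `cephadm ls` list an OSD the cluster no longer knows?) can be sketched in a few lines of Python. This assumes `cephadm ls` prints a JSON array of entries shaped like the one above, and that the set of valid OSD ids has been collected separately, for example from `ceph osd ls`. The function name `stale_osd_entries` and the `osd.12` entry are made up for the illustration.

```python
import json


def stale_osd_entries(cephadm_ls_json, known_osd_ids):
    """Return names of OSD daemons in `cephadm ls` output that are not in known_osd_ids."""
    stale = []
    for daemon in json.loads(cephadm_ls_json):
        daemon_type, _, daemon_id = daemon.get("name", "").partition(".")
        # Only numeric OSD ids are considered; other daemon types are skipped.
        if daemon_type != "osd" or not daemon_id.isdigit():
            continue
        if int(daemon_id) not in known_osd_ids:
            stale.append(daemon["name"])
    return stale


# Abbreviated sample: osd.355 mirrors the entry above, osd.12 is hypothetical.
sample = json.dumps([
    {"style": "cephadm:v1", "name": "osd.355", "state": "stopped", "container_id": None},
    {"style": "cephadm:v1", "name": "osd.12", "state": "running", "container_id": "abc123"},
    {"style": "cephadm:v1", "name": "crash.pech-hd-009", "state": "running"},
])

# known_osd_ids would come from the cluster, e.g. the output of `ceph osd ls`.
print(stale_osd_entries(sample, known_osd_ids={12}))
```

Comparing against the cluster's own OSD list, instead of relying only on `"state": "stopped"`, avoids flagging an OSD that is merely down but still exists.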