[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Eugen Block

Hi,

someone else had a similar issue [1]. To set the global container
image you can run:


$ ceph config set global container_image my-registry:5000/ceph/ceph:v17.2.6

I usually change that as soon as a cluster is up and running, or after
an upgrade, so there's no risk of pulling the wrong container images (I
assume in your case the local cephadm versions on the hosts differ and
therefore each one pulls a different default image hard-coded in the
cephadm binary).
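
One way to check that (a rough sketch, assuming cephadm is installed on
each host) is to compare the cluster-wide default with what the host-local
cephadm binary reports:

$ ceph config get mgr container_image      # cluster-wide default image the orchestrator uses
$ cephadm version                          # per host: version of the local cephadm binary
$ cephadm ls | grep container_image_name   # per host: images the deployed daemons actually reference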


You should probably be able to start a mgr daemon by temporarily
changing the unit.run file and replacing "CONTAINER_IMAGE" with a correct
image version (stop the pod first):


CONTAINER_IMAGE=my-registry/ceph/ceph-quincy@v17.2.6 (this is just an  
example).


The same line contains another image reference which you should also
change. Then restart that pod (e.g. with systemctl); hopefully you'll
then have a MGR up and running and be able to use the orchestrator again.
This procedure helped me in the past.
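
A rough sketch of those steps (fsid, daemon name and image are
placeholders, paths assume the default cephadm layout under /var/lib/ceph):

# stop the pod first
systemctl stop ceph-<fsid>@mgr.<name>.service
# fix both image references on the podman/docker run line
vi /var/lib/ceph/<fsid>/mgr.<name>/unit.run
# restart the daemon with the corrected image
systemctl start ceph-<fsid>@mgr.<name>.service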


Regards,
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/THAH2JFQNB7B4BPUHTRDPGXJ75WPNSNK/


Quoting Stefan Kooman:


On 15-09-2023 10:25, Stefan Kooman wrote:


I could just nuke the whole dev cluster, wipe all disks and start
fresh after reinstalling the hosts, but as I have to adopt 17
clusters to the orchestrator, I'd rather get some learnings from the
non-working setup


There is actually a cephadm "kill it with fire" option to do that
for you, but yeah, make sure you know how to fix it when things do
not go according to plan. It all magically works, until it doesn't.



cephadm rm-cluster --fsid your-fsid-here --force

... as a last resort (short of wipefs / shred on all disks).

Gr. Stefan



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Stefan Kooman

On 15-09-2023 10:25, Stefan Kooman wrote:


I could just nuke the whole dev cluster, wipe all disks and start
fresh after reinstalling the hosts, but as I have to adopt 17 clusters
to the orchestrator, I'd rather get some learnings from the non-working
setup


There is actually a cephadm "kill it with fire" option to do that for
you, but yeah, make sure you know how to fix it when things do not go
according to plan. It all magically works, until it doesn't.



cephadm rm-cluster --fsid your-fsid-here --force

... as a last resort (short of wipefs / shred on all disks).
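
If the disks should be wiped as well, a rough sketch (device paths and
host names are placeholders, double-check you are on the right host):

# while the orchestrator still works:
ceph orch device zap <host> /dev/sdX --force
# or after rm-cluster, on each host:
wipefs -a /dev/sdX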

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Stefan Kooman

On 15-09-2023 09:21, Boris Behrens wrote:

Hi Stefan,

the cluster is running 17.2.6 across the board. The mentioned containers
with other versions don't show up in ceph -s or ceph versions.

It looks like it is host related.
One host gets the correct 17.2.6 images, one gets the 16.2.11 images and
the third one uses the 17.0.0-7183-g54142666 images (whatever that is).


root@0cc47a6df330:~# ceph config-key get config/global/container_image
Error ENOENT:

root@0cc47a6df330:~# ceph config-key list |grep container_image
     "config-history/12/+mgr.0cc47a6df14e/container_image",
     "config-history/13/+mgr.0cc47aad8ce8/container_image",
     "config/mgr.0cc47a6df14e/container_image",
     "config/mgr.0cc47aad8ce8/container_image",

I've tried to set the default image with: ceph config-key set
config/global/container_image
quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232

But I cannot redeploy the mgr daemons, because there is no standby daemon.

root@0cc47a6df330:~# ceph orch redeploy mgr
Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No 
standby MGR


But there should be:
root@0cc47a6df330:~# ceph orch ps
NAME                     HOST          PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)  22s ago    2m   10.6M    -        16.2.11  de4b0b384ad4  0f31a162fa3e
mgr.0cc47aad8ce8         0cc47aad8ce8          running (16h)  8m ago     16h  591M     -        17.2.6   22cd8daf4d70  8145c63fdc44


I guess that one of the managers is not working correctly (probably the
16.2.11 version). IIRC I have changed the image reference for a
container (in the systemd unit files) once, when I managed to redeploy
all containers with a non-working image (test setup). So first make sure
which manager is actually running, then try to fix the other one by
editing the relevant config for that container (point it to the same
image as the running container). Pull the necessary image first if need
be. After you've got a standby manager up and running, you can redeploy
the necessary daemons. Be careful ... there are commands that redeploy
all daemons at the same time, and you normally don't want to do that ;-).
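
A rough sketch of that; the image, fsid and daemon name are assumptions
based on the ceph orch ps output above, adjust as needed:

# on the host running the broken mgr: pull the image the healthy mgr uses
podman pull quay.io/ceph/ceph:v17.2.6
# point both image references in that daemon's unit.run at it, then restart
vi /var/lib/ceph/<fsid>/mgr.0cc47a6df14e.iltiot/unit.run
systemctl restart ceph-<fsid>@mgr.0cc47a6df14e.iltiot.service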




root@0cc47a6df330:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr              2/2  8m ago     19h  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8

I've also removed podman and containerd, deleted all directories and then
did a fresh reinstall of podman, which also did not work.
It's also strange that the daemons with the wonky version got an extra
suffix.


If I knew how, I would happily nuke the whole orchestrator, podman
and everything that goes along with it, and start over. In the end it is
not that hard to start some mgr/mon daemons without podman, so I would
be back to a classical cluster.
I tried this yesterday, but the daemons still use those very strange
images and I just don't understand why.


I could just nuke the whole dev cluster, wipe all disks and start fresh
after reinstalling the hosts, but as I have to adopt 17 clusters to the
orchestrator, I'd rather get some learnings from the non-working setup :)


There is actually a cephadm "kill it with fire" option to do that for 
you, but yeah, make sure you know how to fix it when things do not go 
according to plan. It all magically works, until it doesn't ;-).


Good luck, and keep us updated with any further challenges / progress.

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Boris Behrens
Hi Stefan,

the cluster is running 17.2.6 across the board. The mentioned containers
with other versions don't show up in ceph -s or ceph versions.
It looks like it is host related.
One host gets the correct 17.2.6 images, one gets the 16.2.11 images and
the third one uses the 17.0.0-7183-g54142666 images (whatever that is).

root@0cc47a6df330:~# ceph config-key get config/global/container_image
Error ENOENT:

root@0cc47a6df330:~# ceph config-key list |grep container_image
"config-history/12/+mgr.0cc47a6df14e/container_image",
"config-history/13/+mgr.0cc47aad8ce8/container_image",
"config/mgr.0cc47a6df14e/container_image",
"config/mgr.0cc47aad8ce8/container_image",

I've tried to set the default image with: ceph config-key set
config/global/container_image
quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
But I cannot redeploy the mgr daemons, because there is no standby daemon.

root@0cc47a6df330:~# ceph orch redeploy mgr
Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No standby
MGR

But there should be:
root@0cc47a6df330:~# ceph orch ps
NAME                     HOST          PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)  22s ago    2m   10.6M    -        16.2.11  de4b0b384ad4  0f31a162fa3e
mgr.0cc47aad8ce8         0cc47aad8ce8          running (16h)  8m ago     16h  591M     -        17.2.6   22cd8daf4d70  8145c63fdc44

root@0cc47a6df330:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr  2/2  8m ago 19h  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8

I've also removed podman and containerd, deleted all directories and then
did a fresh reinstall of podman, which also did not work.
It's also strange that the daemons with the wonky version got an extra
suffix.

If I knew how, I would happily nuke the whole orchestrator, podman and
everything that goes along with it, and start over. In the end it is not
that hard to start some mgr/mon daemons without podman, so I would be back
to a classical cluster.
I tried this yesterday, but the daemons still use those very strange images
and I just don't understand why.

I could just nuke the whole dev cluster, wipe all disks and start fresh
after reinstalling the hosts, but as I have to adopt 17 clusters to the
orchestrator, I'd rather get some learnings from the non-working setup :)

On Fri, 15 Sep 2023 at 08:26, Stefan Kooman wrote:

> On 14-09-2023 17:49, Boris Behrens wrote:
> > Hi,
> > I'm currently trying to adopt our stage cluster; some hosts just pull
> > strange images.
> >
> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
> > CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS  NAMES
> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel   -n mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago         ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
> >
> > root@0cc47a6df330:~# ceph orch ps
> > NAME                     HOST                             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION                IMAGE ID      CONTAINER ID
> > mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running (3m)   3m ago     3m   10.8M    -        16.2.11                de4b0b384ad4  00b02cd82a1c
> > mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running (5s)   2s ago     4s   10.5M    -        17.0.0-7183-g54142666  75e3d7089cea  662c6baa097e
> > mgr.0cc47aad8ce8         0cc47aad8ce8.f00f.gridscale.dev          running (65m)  8m ago     60m  553M     -        17.2.6                 22cd8daf4d70  8145c63fdc44
> >
> > Any idea what I need to do to change that?
>
> I want to get some things cleared up. What is the version you are
> running? I see three different ceph versions active now. I see you are
> running a podman ps command, but see docker images pulled. AFAIK podman
> needs a different IMAGE than docker ... or do you have a mixed setup?
>
> What does "ceph config-key get config/global/container_image" give you?
>
> ceph config-key list |grep container_image should give you a list
> (including config-history) where you can see what has been configured
> before.
>
> cephadm logs might give a clue as well.
>
> You can configure the IMAGE version / type that you want by setting the
> key and redeploy affected containers: For example (18.1.2):
>
> ceph config-key set config/global/container_image
>
> quay.io/ceph/ceph:v18.1.2@sha256:82a380c8127c42da406b7ce1281c2f3c0a86d4ba04b1f4b5f8d1036b8c24784f
>
> Gr. Stefan
>


-- 
The "UTF-8 problems" self-help group will, as an exception, meet in the
large hall this time.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Marc
> > I'm currently trying to adopt our stage cluster; some hosts just pull
> > strange images.
> >
> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
> > CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS  NAMES
> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel   -n mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago         ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
> >
> > root@0cc47a6df330:~# ceph orch ps
> > NAME                     HOST                             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION                IMAGE ID      CONTAINER ID
> > mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running (3m)   3m ago     3m   10.8M    -        16.2.11                de4b0b384ad4  00b02cd82a1c
> > mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running (5s)   2s ago     4s   10.5M    -        17.0.0-7183-g54142666  75e3d7089cea  662c6baa097e
> > mgr.0cc47aad8ce8         0cc47aad8ce8.f00f.gridscale.dev          running (65m)  8m ago     60m  553M     -        17.2.6                 22cd8daf4d70  8145c63fdc44
> >
> > Any idea what I need to do to change that?
> 
> I want to get some things cleared up. What is the version you are
> running? I see three different ceph versions active now. I see you are
> running a podman ps command, but see docker images pulled. AFAIK podman
> needs a different IMAGE than docker ... or do you have a mixed setup?

Podman does not need different images. I think lots of container
orchestrators use the docker image format. AFAIK podman is mostly a fork
of docker.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Stefan Kooman

On 14-09-2023 17:49, Boris Behrens wrote:

Hi,
I'm currently trying to adopt our stage cluster; some hosts just pull
strange images.

root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS  NAMES
a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel   -n mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago         ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl

root@0cc47a6df330:~# ceph orch ps
NAME                     HOST                             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION                IMAGE ID      CONTAINER ID
mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running (3m)   3m ago     3m   10.8M    -        16.2.11                de4b0b384ad4  00b02cd82a1c
mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running (5s)   2s ago     4s   10.5M    -        17.0.0-7183-g54142666  75e3d7089cea  662c6baa097e
mgr.0cc47aad8ce8         0cc47aad8ce8.f00f.gridscale.dev          running (65m)  8m ago     60m  553M     -        17.2.6                 22cd8daf4d70  8145c63fdc44

Any idea what I need to do to change that?


I want to get some things cleared up. What is the version you are 
running? I see three different ceph versions active now. I see you are 
running a podman ps command, but see docker images pulled. AFAIK podman 
needs a different IMAGE than docker ... or do you have a mixed setup?


What does "ceph config-key get config/global/container_image" give you?

ceph config-key list |grep container_image should give you a list 
(including config-history) where you can see what has been configured 
before.


cephadm logs might give a clue as well.
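
For example (a sketch; the daemon name is a placeholder taken from the
ceph orch ps output above):

cephadm logs --name mgr.0cc47a6df330.iijety
cephadm ls | grep container_image_name   # image each deployed daemon on this host references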

You can configure the IMAGE version / type that you want by setting the
key and redeploying the affected containers. For example (18.1.2):


ceph config-key set config/global/container_image 
quay.io/ceph/ceph:v18.1.2@sha256:82a380c8127c42da406b7ce1281c2f3c0a86d4ba04b1f4b5f8d1036b8c24784f
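
Then redeploy the affected daemons, for example (a sketch; the daemon
name is taken from the ceph orch ps output above):

ceph orch daemon redeploy mgr.0cc47a6df14e.vqizdz quay.io/ceph/ceph:v18.1.2
# or, once a standby mgr is available, the whole service at once:
ceph orch redeploy mgr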


Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io