Hi Chris
Having also recently started exploring Ceph, I too happened upon this problem.
I found that terminating the command with Ctrl-C seemed to stop the looping,
which, by the way, also happens on all other mgr instances in the cluster.
Regards
Jens
-----Original Message-----
From: Chris Read
Sent: 11. januar 2021 21:54
To: ceph-users@ceph.io
Subject: [ceph-users] "ceph orch restart mgr" command creates mgr restart loop
Greetings all...
I'm busy testing out Ceph and have hit this troublesome bug while following the
steps outlined here:
https://docs.ceph.com/en/octopus/cephadm/monitoring/#configuring-ssl-tls-for-grafana
When I issue the "ceph orch restart mgr" command, it appears the command is not
cleared from a message queue somewhere (I'm still very unclear on many ceph
specifics), and so each time the mgr process returns from restart it picks up
the message again and keeps restarting itself forever (so far it's been stuck
in this state for 45 minutes).
Watching the logs we see this going on:
$ ceph log last cephadm -w
root@ceph-poc-000:~# ceph log last cephadm -w
cluster:
id: d23bc326-543a-11eb-bfe0-b324db228b6c
health: HEALTH_OK
services:
mon: 5 daemons, quorum
ceph-poc-000,ceph-poc-003,ceph-poc-004,ceph-poc-002,ceph-poc-001 (age 2h)
mgr: ceph-poc-000.himivo(active, since 4s), standbys:
ceph-poc-001.unjulx
osd: 10 osds: 10 up (since 2h), 10 in (since 2h)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 10 GiB used, 5.4 TiB / 5.5 TiB avail
pgs: 1 active+clean
2021-01-11T20:46:32.976606+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:32.980749+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:33.061519+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:46:39.156420+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:39.160618+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:39.242603+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:46:45.299953+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:45.304006+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:45.733495+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:46:51.871903+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:51.877107+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:51.976190+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:46:58.000720+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:58.006843+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:58.097163+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:47:04.188630+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:47:04.193501+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:47:04.285509+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
2021-01-11T20:47:10.348099+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:47:10.352340+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:47:10.752243+ mon.ceph-poc-000 [INF] Manager daemon
ceph-poc-000.himivo is now available
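For anyone wanting to confirm the loop from output like the above rather than by eye, here is a minimal Python sketch that pulls out the "Active manager daemon ... restarted" events and measures the gap between them. The function names (restart_times, restart_intervals, looks_like_loop) and the thresholds are hypothetical, chosen only for illustration; this is not part of Ceph. It assumes the log format shown above, including the wrapped lines, and ignores the timestamp offset.

```python
import re
from datetime import datetime

# Matches "Active manager daemon <name> restarted" events. The daemon name may
# wrap onto the next line in pasted output, so whitespace is matched loosely.
RESTART_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\S*\s+"  # timestamp (offset ignored)
    r"\S+\s+\[INF\]\s+Active manager daemon\s+(\S+)\s+restarted"
)


def restart_times(log_text):
    """Return a list of (daemon_name, datetime) for each restart event."""
    events = []
    for stamp, daemon in RESTART_RE.findall(log_text):
        events.append((daemon, datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")))
    return events


def restart_intervals(events):
    """Seconds between consecutive restart events."""
    times = [t for _, t in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def looks_like_loop(intervals, min_restarts=3, max_gap=30.0):
    """Hypothetical heuristic: several restarts, each within max_gap seconds."""
    return len(intervals) + 1 >= min_restarts and all(g <= max_gap for g in intervals)


# First three restart events copied from the output above.
SAMPLE = """\
2021-01-11T20:46:32.976606+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:32.980749+ mon.ceph-poc-000 [INF] Activating manager
daemon ceph-poc-000.himivo
2021-01-11T20:46:39.156420+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
2021-01-11T20:46:45.299953+ mon.ceph-poc-000 [INF] Active manager daemon
ceph-poc-000.himivo restarted
"""

events = restart_times(SAMPLE)
gaps = restart_intervals(events)
print(len(events), [round(g, 3) for g in gaps], looks_like_loop(gaps))
```

On the sample, the gaps come out at a little over six seconds each, which matches the steady restart cadence visible in the log. The 30-second threshold is an arbitrary assumption and would need adjusting for a real cluster.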
And in the logs for the mgr instance itself we see it keep replaying the
message over and over:
$ docker logs -f
ceph-d23bc326-543a-11eb-bfe0-b324db228b6c-mgr.ceph-poc-000.himivo
debug 2021-01-11T20:47:31.390+ 7f48b0d0d200 0 set uid:gid to 167:167
(ceph:ceph)
debug 2021-01-11T20:47:31.390+ 7f48b0d0d200 0 ceph version 15.2.8
(bdf3eebcd22d7d0b3dd4d5501bee5bac354d5b55) octopus (stable), process ceph-mgr,
pid 1
debug 2021-01-11T20:47:31.390+ 7f48b0d0d200 0 pidfile_write: ignore
empty --pid-file
debug 2021-01-11T20:47:31.414+ 7f48b0d0d200 1 mgr[py] Loading python
module 'alerts'
debug 2021-01-11T20:47:31.486+ 7f48b0d0d200 1 mgr[py] Loading python
module 'balancer'
debug 2021-01-11T20:47:31.542+ 7f48b0d0d200 1 mgr[py] Loading python
module 'cephadm'
debug 2021-01-11T20:47:31.742+ 7f48b0d0d200 1 mgr[py] Loading python
module 'crash'
debug 2021-01-11T20:47:31.798+ 7f48b0d0d200 1 mgr[py] Loading python
module 'dashboard'
debug 2021-01-11T20:47:32.258+ 7f48b0d0d200 1 mgr[py] Loading python
module 'devicehealth'
debug 2021-01-11T20:47:32.306+ 7f48b0d0d200 1 mgr[py] Loading python
module 'diskprediction_local'
debug