Hi,
I just migrated to cephadm on my 2-node Octopus cluster.
I'm seeing the same problem: the MDS started in a container
is not visible to Ceph. I had to keep the old systemd MDS
running to keep the filesystem available.
Some outputs:
Just tried it: I stopped all MDS daemons and created one using orch. Result: 0/1
daemons up (1 failed), 1 standby. Same as before, and the logs don't show any
errors either.
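For anyone hitting the same state, these are the commands I'd use to narrow down why an orchestrator-deployed MDS shows as failed (the filesystem name and daemon name below are placeholders; substitute the names from your own cluster):

```shell
# Show filesystem ranks and which MDS daemons are active/standby/failed
ceph fs status

# List the orchestrator-managed MDS daemons and their reported state
ceph orch ps --daemon-type mds

# Pull the container logs for one specific daemon
# (use the exact daemon name printed by `ceph orch ps`)
cephadm logs --name mds.<daemon-name>
```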
I'll probably try upgrading the orch-based setup to 16.2.6 over the weekend to
match the exact version of the non-dockerized MDS.
By saying upgrade, I mean upgrading from the non-dockerized 16.2.5 to cephadm
version 16.2.6. So I think you need to disable standby-replay and reduce the
number of ranks to 1, then stop all the non-dockerized MDS daemons and deploy
new MDS daemons with cephadm, scaling back up only after the migration is done.
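The sequence described above would look roughly like this. This is a sketch, not a tested runbook: the filesystem name `cephfs`, the host names, and the MDS count of 2 are assumptions to adjust for your cluster.

```shell
# 1. Disable standby-replay and scale the filesystem down to a single rank
ceph fs set cephfs allow_standby_replay false
ceph fs set cephfs max_mds 1

# 2. On each MDS host, stop the legacy (non-containerized) daemon
#    (the instance id after @ depends on how the unit was originally named)
systemctl stop ceph-mds@$(hostname -s)

# 3. Deploy containerized MDS daemons through the orchestrator
ceph orch apply mds cephfs --placement="2 host1 host2"

# 4. Only after the new daemons are up and active, scale back up
ceph fs set cephfs max_mds 2
ceph fs set cephfs allow_standby_replay true
```

Note that stopping the last legacy MDS before the cephadm one takes over means a brief window with no active MDS, so the filesystem will be unavailable during the handoff.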
Hi Weiwen,
Yes, we did that during the upgrade. In fact, we did it multiple times even
after the upgrade to see whether it would resolve the issue (disabling hot
standby, scaling everything down to a single MDS, swapping it with the new
one, scaling back up).
The upgrade itself went fine,