You could try pausing the upgrade and manually "upgrading" the MDS daemons
by redeploying them on the new image. Something like "ceph orch daemon
redeploy <daemon-name> --image <17.2.6 image>" (daemon names should
match those in "ceph orch ps" output). If you do that for all of them and
then get them into an up state
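As a concrete sketch of the suggestion above (the daemon name is taken from the log lines later in this thread, and the image reference is the usual upstream Quay image, both stand-ins for whatever your cluster actually uses):

```shell
# Pause the in-flight cephadm upgrade first
ceph orch upgrade pause

# Find the MDS daemon names to redeploy
ceph orch ps --daemon-type mds

# Redeploy one MDS on the target image; repeat for each MDS daemon
ceph orch daemon redeploy mds.mds01.ceph06.rrxmks --image quay.io/ceph/ceph:v17.2.6
```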
Will also note that the normal upgrade process scales down the mds service
to have only 1 mds per fs before upgrading it, so maybe something you'd
want to do as well if the upgrade didn't do it already. It does so by
setting the max_mds to 1 for the fs.
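The scale-down step can also be done by hand. A sketch, assuming the filesystem is named "mds01" (inferred from the daemon names in this thread; substitute your own fs name from "ceph fs ls"):

```shell
# Scale the filesystem down to a single active MDS before upgrading it
ceph fs set mds01 max_mds 1

# Verify the setting took effect
ceph fs get mds01 | grep max_mds
```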
On Mon, Apr 10, 2023 at 3:51 PM Adam King wrote:
I did what you told me.
I also see in the log, that the command went through:
2023-04-10T19:58:46.522477+ mgr.ceph04.qaexpv [INF] Schedule
redeploy daemon mds.mds01.ceph06.rrxmks
2023-04-10T20:01:03.360559+ mgr.ceph04.qaexpv [INF] Schedule
redeploy daemon mds.mds01.ceph05.pqxmvt
2023-0
It seems like it maybe didn't actually do the redeploy, as it should log
something saying it's actually doing it on top of the line saying it
scheduled it. To confirm, the upgrade is paused ("ceph orch upgrade status"
reports is_paused as true)? If so, maybe try doing a mgr failover ("ceph
mgr fail")
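Those two checks look like this (a sketch; the commands are the standard cephadm orchestrator CLI):

```shell
# Confirm the upgrade really is paused; is_paused should read true
ceph orch upgrade status

# Fail over to a standby mgr so a fresh cephadm module picks up the queued redeploys
ceph mgr fail
```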
On 4/11/23 03:24, Thomas Widhalm wrote:
Hi,
If you remember, I hit bug https://tracker.ceph.com/issues/58489 so I
was very relieved when 17.2.6 was released and started to update
immediately.
Please note, this fix is not in v17.2.6 upstream yet.
Thanks
- Xiubo
But now I'm s
On 4/11/23 15:59, Thomas Widhalm wrote:
Thanks for your detailed explanations! That helped a lot.
All MDS are still in status error. "ceph orch device ls" showed that
some hosts seem to not have enough space on devices. I wonder why I
didn't see that in monitoring. Anyway, I'll fix that and then try to
proceed.
When the backport i
Sorry - the info about the insufficient space seems like it referred to
why the devices are not available. So that's just as it should be.
All MDS are still in error state and were refreshed 2d ago, even right
after a mgr failover. So it seems there's something else going on.
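One way to look at what cephadm has recorded for the daemons, including the last-refreshed time (a sketch using the orchestrator CLI; --refresh asks the orchestrator to re-poll the hosts rather than serve cached state):

```shell
# List only the MDS daemons with their status and refresh age
ceph orch ps --daemon-type mds

# Force a fresh inventory pass instead of the cached view
ceph orch ps --refresh
```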
One thing that