[ceph-users] MDS daemons don't report any more
Hi all, I'm making a weird observation: 8 out of 12 MDS daemons seem not to report to the cluster any more:

# ceph fs status
con-fs2 - 1625 clients
=======
RANK  STATE    MDS      ACTIVITY       DNS    INOS
 0    active  ceph-16  Reqs:    0 /s      0      0
 1    active  ceph-09  Reqs:  128 /s  4251k  4250k
 2    active  ceph-17  Reqs:    0 /s      0      0
 3    active  ceph-15  Reqs:    0 /s      0      0
 4    active  ceph-24  Reqs:  269 /s  3567k  3567k
 5    active  ceph-11  Reqs:    0 /s      0      0
 6    active  ceph-14  Reqs:    0 /s      0      0
 7    active  ceph-23  Reqs:    0 /s      0      0

        POOL            TYPE      USED   AVAIL
   con-fs2-meta1      metadata   2169G   7081G
   con-fs2-meta2        data        0    7081G
    con-fs2-data        data     1248T   4441T
con-fs2-data-ec-ssd     data      705G   22.1T
   con-fs2-data2        data     3172T   4037T

STANDBY MDS
  ceph-08
  ceph-10
  ceph-12
  ceph-13

                             VERSION                                            DAEMONS
                              None                                ceph-16, ceph-17, ceph-15, ceph-11, ceph-14, ceph-23, ceph-10, ceph-12
ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)   ceph-09, ceph-24, ceph-08, ceph-13

The VERSION column shows "None" for these eight daemons and there are no stats. "ceph versions" reports only 4 of the 12 MDSes; the other 8 are not shown at all:

[root@gnosis ~]# ceph versions
{
    "mon": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 5
    },
    "mgr": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 5
    },
    "osd": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 1282
    },
    "mds": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 4
    },
    "overall": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 1296
    }
}

"ceph status" reports everything as up and OK:

[root@gnosis ~]# ceph status
  cluster:
    id:     e4ece518-f2cb-4708-b00f-b6bf511e91d9
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 2w)
    mgr: ceph-03(active, since 61s), standbys: ceph-25, ceph-01, ceph-02, ceph-26
    mds: con-fs2:8 4 up:standby 8 up:active
    osd: 1284 osds: 1282 up (since 31h), 1282 in (since 33h); 567 remapped pgs

  data:
    pools:   14 pools, 25065 pgs
    objects: 2.14G objects, 3.7 PiB
    usage:   4.7 PiB used, 8.4 PiB / 13 PiB avail
    pgs:     79908208/18438361040 objects misplaced (0.433%)
             23063 active+clean
             1225  active+clean+snaptrim_wait
             317   active+remapped+backfill_wait
             250   active+remapped+backfilling
             208   active+clean+snaptrim
             2     active+clean+scrubbing+deep

  io:
    client:   596 MiB/s rd, 717 MiB/s wr, 4.16k op/s rd, 3.04k op/s wr
    recovery: 8.7 GiB/s, 3.41k objects/s

My first thought was that the mgr status module had failed. However, I cannot restart it, as it is an always-on module, and an MGR fail-over did not help. Any ideas what is going on here?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
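A minimal sketch of further diagnostics one might try in this situation, assuming a non-containerized deployment (the daemon name ceph-16 is taken from the status output above):

# List mgr modules; "status" is an always-on module and cannot be toggled
ceph mgr module ls

# Check whether the mons still hold metadata for one of the silent MDSes;
# an empty result would suggest the daemon never re-registered
ceph mds metadata ceph-16

# Restarting an affected MDS should force it to re-register with the mons
systemctl restart ceph-mds@ceph-16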
[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster
That may be the very one I was thinking of, though the OP seemed to be preserving the IP addresses, so I suspect containerization is in play.

> On Sep 9, 2023, at 11:36 AM, Tyler Stachecki wrote:
>
> On Sat, Sep 9, 2023 at 10:48 AM Anthony D'Atri wrote:
>> There was also at one point an issue where clients wouldn't get a runtime
>> update of new mons.
>
> There are also 8+ year old unresolved bugs like this in OpenStack Cinder
> that will bite you if the relocated mons have new IP addresses:
> https://bugs.launchpad.net/nova/+bug/1452641
>
> Tripling down on what others have said: I would advise against
> redeploying mons unless you need to...
>
> FYI: you can relocate the OSDs without having Ceph spew bits about by
> setting noout, stopping the OSDs to be moved, physically moving the
> underlying drive(s) to another host, running `ceph-volume lvm activate
> --all` on the new host, and unsetting noout.
>
> Regards,
> Tyler
[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster
On Sat, Sep 9, 2023 at 10:48 AM Anthony D'Atri wrote:
> There was also at one point an issue where clients wouldn't get a runtime
> update of new mons.

There are also 8+ year old unresolved bugs like this in OpenStack Cinder that will bite you if the relocated mons have new IP addresses:
https://bugs.launchpad.net/nova/+bug/1452641

Tripling down on what others have said: I would advise against redeploying mons unless you need to...

FYI: you can relocate the OSDs without having Ceph spew bits about by setting noout, stopping the OSDs to be moved, physically moving the underlying drive(s) to another host, running `ceph-volume lvm activate --all` on the new host, and unsetting noout.

Regards,
Tyler
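As a hedged sketch, the relocation sequence described above might look like this on the command line (the OSD id 42 is a placeholder; the commands are the standard ceph/ceph-volume CLI):

# Prevent the stopped OSDs from being marked out while their drives move
ceph osd set noout

# On the old host: stop the OSD(s) being relocated
systemctl stop ceph-osd@42

# ...physically move the underlying drive(s) to the new host...

# On the new host: detect the LVM-backed OSDs and start them all
ceph-volume lvm activate --all

# Once the OSDs are back up and in, re-enable out-marking
ceph osd unset noout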
[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster
Which Ceph release are you running, and how was it deployed? With some older releases I experienced mons behaving unexpectedly when one of the quorum bounced, so I still like to segregate them for isolation. There was also at one point an issue where clients wouldn't get a runtime update of new mons.

I endorse Eugen's strategy, but must first ask about the server and client releases involved. Especially since you wrote "old".

> On Sep 9, 2023, at 5:28 AM, Eugen Block wrote:
>
> Hi,
>
> is it an actual requirement to redeploy MONs? Because almost all clusters
> we support or assist with have MONs and OSDs colocated. MON daemons are
> quite light-weight services, so if it's not really necessary, I'd leave it
> as it is.
>
> If you really need to move the MONs to different servers, I'd recommend
> adding the new MONs one by one. Your monmap will then contain old and new
> MONs, and when all new MONs (with new IPs) are up and running you can
> remove the old MON daemons. There's no need to switch off OSDs or drain a
> host. You can find more information in the Nautilus docs [1], from before
> the orchestrator was available.
>
> Regards,
> Eugen
>
> [1] https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-mons/
>
> Zitat von Ramin Najjarbashi:
>
>> Hi
>>
>> I am writing to seek guidance and best practices for a maintenance
>> operation in my Ceph cluster. I have an older cluster in which the
>> Monitors (Mons) and Object Storage Devices (OSDs) are currently deployed
>> on the same host. I am interested in separating them while ensuring zero
>> downtime and minimizing risks to the cluster's stability.
>>
>> The primary goal is to deploy new Monitors on different servers without
>> causing service interruptions or disruptions to data availability.
>>
>> The challenge arises because updating the configuration to add new
>> Monitors typically requires a restart of all OSDs, which is less than
>> ideal in terms of maintaining cluster availability.
>>
>> One approach I considered is to reweight all OSDs on the host to zero,
>> allowing data to gradually transfer to other OSDs. Once all data has been
>> safely migrated, I would proceed to remove the old OSDs. Afterward, I
>> would deploy the new Monitors on a different server with the previous IP
>> addresses and deploy the OSDs on the old Monitors' host with new IP
>> addresses.
>>
>> While this approach seems to minimize risks, it can be time-consuming and
>> may not be the most efficient way to achieve the desired separation.
>>
>> I would greatly appreciate the community's insights and suggestions on
>> the best approach to achieve this separation of Mons and OSDs with zero
>> downtime and minimal risk. If there are alternative methods or best
>> practices that can be recommended, please share your expertise.
[ceph-users] Best practices regarding MDS node restart
Hello,

I am interested in best-practice guidance for the following situation.

There is a Ceph cluster with CephFS deployed. Three servers are dedicated to running MDS daemons: one active, one standby-replay, and one standby. There is only a single rank.

Sometimes, servers need to be rebooted for reasons unrelated to Ceph. What's the proper procedure to follow when restarting a node that currently hosts the active MDS? The goal is to minimize client downtime. Ideally, clients should not notice even if they are playing MP3s from the CephFS filesystem (note that I haven't tested this exact scenario) - is this achievable?

I tried the "ceph mds fail mds02" command while mds02 was active and mds03 was standby-replay, to force a fail-over to mds03 so that I could reboot mds02. Result: mds02 became standby, while mds03 went through the reconnect (30 seconds), rejoin (another 30 seconds), and replay (5 seconds) phases. During the reconnect and rejoin phases, the "Activity" column of "ceph fs status" is empty, which concerns me. It looks like I just caused a 65-second downtime. After all of that, mds02 became standby-replay, as expected.

Is there a better way? Or should I have just rebooted mds02 without much thinking?

--
Alexander E. Patrakov
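For reference, a minimal sketch of the fail-over attempt described above, assuming the daemon names mds02/mds03 and a single rank:

# Confirm mds03 is currently standby-replay behind the active mds02
ceph fs status

# Hand the rank over; mds03 goes through reconnect/rejoin/replay
ceph mds fail mds02

# Wait until the new active MDS shows request activity again, then
# mds02 can be rebooted; it should return as standby-replay
ceph fs status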
[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster
Hi,

is it an actual requirement to redeploy MONs? Because almost all clusters we support or assist with have MONs and OSDs colocated. MON daemons are quite light-weight services, so if it's not really necessary, I'd leave it as it is.

If you really need to move the MONs to different servers, I'd recommend adding the new MONs one by one. Your monmap will then contain old and new MONs, and when all new MONs (with new IPs) are up and running you can remove the old MON daemons. There's no need to switch off OSDs or drain a host. You can find more information in the Nautilus docs [1], from before the orchestrator was available.

Regards,
Eugen

[1] https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-mons/

Zitat von Ramin Najjarbashi:

Hi

I am writing to seek guidance and best practices for a maintenance operation in my Ceph cluster. I have an older cluster in which the Monitors (Mons) and Object Storage Devices (OSDs) are currently deployed on the same host. I am interested in separating them while ensuring zero downtime and minimizing risks to the cluster's stability.

The primary goal is to deploy new Monitors on different servers without causing service interruptions or disruptions to data availability.

The challenge arises because updating the configuration to add new Monitors typically requires a restart of all OSDs, which is less than ideal in terms of maintaining cluster availability.

One approach I considered is to reweight all OSDs on the host to zero, allowing data to gradually transfer to other OSDs. Once all data has been safely migrated, I would proceed to remove the old OSDs. Afterward, I would deploy the new Monitors on a different server with the previous IP addresses and deploy the OSDs on the old Monitors' host with new IP addresses.

While this approach seems to minimize risks, it can be time-consuming and may not be the most efficient way to achieve the desired separation.

I would greatly appreciate the community's insights and suggestions on the best approach to achieve this separation of Mons and OSDs with zero downtime and minimal risk. If there are alternative methods or best practices that can be recommended, please share your expertise.
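A hedged sketch of adding one new MON by hand, following the linked Nautilus procedure (the mon id mon-new and the /tmp paths are placeholders; adapt to your deployment):

# On the new host: create the mon data directory
mkdir -p /var/lib/ceph/mon/ceph-mon-new

# Retrieve the mon keyring and the current monmap from the cluster
ceph auth get mon. -o /tmp/mon.keyring
ceph mon getmap -o /tmp/monmap

# Initialize the new monitor from the retrieved map and keyring
ceph-mon -i mon-new --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring

# Start it; it joins the quorum and the monmap is extended
systemctl start ceph-mon@mon-new

# Later, once all new MONs are in quorum, remove an old one
ceph mon remove <old-mon-id>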
[ceph-users] Separating Mons and OSDs in Ceph Cluster
Hi

I am writing to seek guidance and best practices for a maintenance operation in my Ceph cluster. I have an older cluster in which the Monitors (Mons) and Object Storage Devices (OSDs) are currently deployed on the same host. I am interested in separating them while ensuring zero downtime and minimizing risks to the cluster's stability.

The primary goal is to deploy new Monitors on different servers without causing service interruptions or disruptions to data availability.

The challenge arises because updating the configuration to add new Monitors typically requires a restart of all OSDs, which is less than ideal in terms of maintaining cluster availability.

One approach I considered is to reweight all OSDs on the host to zero, allowing data to gradually transfer to other OSDs. Once all data has been safely migrated, I would proceed to remove the old OSDs. Afterward, I would deploy the new Monitors on a different server with the previous IP addresses and deploy the OSDs on the old Monitors' host with new IP addresses.

While this approach seems to minimize risks, it can be time-consuming and may not be the most efficient way to achieve the desired separation.

I would greatly appreciate the community's insights and suggestions on the best approach to achieve this separation of Mons and OSDs with zero downtime and minimal risk. If there are alternative methods or best practices that can be recommended, please share your expertise.
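As an illustration only (the replies in this thread advise against this approach), the drain described above would amount to something like the following for each OSD on the host; the OSD id 42 is a placeholder:

# Set the CRUSH weight to zero so data gradually migrates off the OSD
ceph osd crush reweight osd.42 0

# Wait until backfill finishes and all PGs are active+clean
ceph -s

# Then take the now-empty OSD out and remove it
# (ceph osd purge is available since Luminous)
ceph osd out 42
ceph osd purge 42 --yes-i-really-mean-it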