[ceph-users] MDS daemons don't report any more

2023-09-09 Thread Frank Schilder
Hi all, I'm making a weird observation: 8 out of 12 MDS daemons don't seem to report
to the cluster any more:

# ceph fs status
con-fs2 - 1625 clients
===
RANK  STATE    MDS      ACTIVITY        DNS    INOS
 0    active  ceph-16  Reqs:    0 /s      0      0
 1    active  ceph-09  Reqs:  128 /s  4251k  4250k
 2    active  ceph-17  Reqs:    0 /s      0      0
 3    active  ceph-15  Reqs:    0 /s      0      0
 4    active  ceph-24  Reqs:  269 /s  3567k  3567k
 5    active  ceph-11  Reqs:    0 /s      0      0
 6    active  ceph-14  Reqs:    0 /s      0      0
 7    active  ceph-23  Reqs:    0 /s      0      0
        POOL            TYPE      USED   AVAIL
   con-fs2-meta1      metadata   2169G   7081G
   con-fs2-meta2        data        0    7081G
   con-fs2-data         data     1248T   4441T
   con-fs2-data-ec-ssd  data      705G   22.1T
   con-fs2-data2        data     3172T   4037T
STANDBY MDS
  ceph-08
  ceph-10
  ceph-12
  ceph-13
VERSION                                                                            DAEMONS
None                                                                               ceph-16, ceph-17, ceph-15, ceph-11, ceph-14, ceph-23, ceph-10, ceph-12
ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)   ceph-09, ceph-24, ceph-08, ceph-13

The version shows as "None" for these daemons and there are no stats. "ceph versions"
reports only 4 of the 12 MDS daemons; the other 8 are not shown at all:

[root@gnosis ~]# ceph versions
{
    "mon": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 5
    },
    "mgr": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 5
    },
    "osd": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 1282
    },
    "mds": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 4
    },
    "overall": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 1296
    }
}

Ceph status reports everything as up and OK:

[root@gnosis ~]# ceph status
  cluster:
id: e4ece518-f2cb-4708-b00f-b6bf511e91d9
health: HEALTH_OK
 
  services:
mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 2w)
mgr: ceph-03(active, since 61s), standbys: ceph-25, ceph-01, ceph-02, 
ceph-26
mds: con-fs2:8 4 up:standby 8 up:active
osd: 1284 osds: 1282 up (since 31h), 1282 in (since 33h); 567 remapped pgs
 
  data:
pools:   14 pools, 25065 pgs
objects: 2.14G objects, 3.7 PiB
usage:   4.7 PiB used, 8.4 PiB / 13 PiB avail
pgs: 79908208/18438361040 objects misplaced (0.433%)
 23063 active+clean
 1225  active+clean+snaptrim_wait
 317   active+remapped+backfill_wait
 250   active+remapped+backfilling
 208   active+clean+snaptrim
 2 active+clean+scrubbing+deep
 
  io:
client:   596 MiB/s rd, 717 MiB/s wr, 4.16k op/s rd, 3.04k op/s wr
recovery: 8.7 GiB/s, 3.41k objects/s

My first thought was that the mgr status module had failed. However, I can't restart
it (it is an always-on module), and an MGR fail-over did not help.
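
A minimal set of checks for this situation (just a sketch; the daemon names are taken
from the listing above, and the systemctl unit name assumes a non-containerized,
package-based install):

# ceph mgr module ls                   (status is an always-on module and should be listed)
# ceph tell mds.ceph-16 version        (ask one of the "silent" MDS daemons directly)
# systemctl restart ceph-mds@ceph-16   (on that MDS's host; a standby takes over and the
                                        restarted daemon usually re-registers with the mgr)

If the MDS answers the tell command but still shows no version and no stats, that points
at stale mgr-side metadata rather than a problem with the daemons themselves.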

Any ideas what is going on here?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-09 Thread Anthony D'Atri
That may be the very one I was thinking of, though the OP seemed to be 
preserving the IP addresses, so I suspect containerization is in play.

> On Sep 9, 2023, at 11:36 AM, Tyler Stachecki  
> wrote:
> 
> On Sat, Sep 9, 2023 at 10:48 AM Anthony D'Atri  
> wrote:
>> There was also at one point an issue where clients wouldn’t get a runtime update 
>> of new mons.
> 
> There are also 8+ year old unresolved bugs like this in OpenStack Cinder
> that will bite you if the relocated mons have new IP addresses:
> https://bugs.launchpad.net/nova/+bug/1452641
> 
> Tripling down on what others have said: would advise against
> redeploying mons unless you need to...
> 
> FYI: you can relocate the OSDs without having Ceph shuffle data around by
> setting noout, stopping the OSDs to be moved, physically moving the
> underlying drive(s) to another host, running `ceph-volume lvm activate
> --all` on the new host, and unsetting noout.
> 
> Regards,
> Tyler
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-09 Thread Tyler Stachecki
On Sat, Sep 9, 2023 at 10:48 AM Anthony D'Atri  wrote:
> There was also at one point an issue where clients wouldn’t get a runtime update 
> of new mons.

There are also 8+ year old unresolved bugs like this in OpenStack Cinder
that will bite you if the relocated mons have new IP addresses:
https://bugs.launchpad.net/nova/+bug/1452641

Tripling down on what others have said: would advise against
redeploying mons unless you need to...

FYI: you can relocate the OSDs without having Ceph shuffle data around by
setting noout, stopping the OSDs to be moved, physically moving the
underlying drive(s) to another host, running `ceph-volume lvm activate
--all` on the new host, and unsetting noout.
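
Sketched out as commands (the OSD ids are placeholders and the systemctl units assume
a non-containerized deployment), that procedure is roughly:

ceph osd set noout
systemctl stop ceph-osd@12 ceph-osd@13     # only the OSDs being moved
# physically move the drives, then on the new host:
ceph-volume lvm activate --all
ceph osd unset noout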

Regards,
Tyler
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-09 Thread Anthony D'Atri
Which Ceph release are you running, and how was it deployed?

With some older releases I experienced mons behaving unexpectedly when one member of 
the quorum bounced, so I still like to segregate them for isolation.  

There was also at one point an issue where clients wouldn’t get a runtime update of 
new mons.   

I endorse Eugen’s strategy, but must first ask which server and client releases are 
involved, especially since you wrote “old”.  
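
For what it's worth, both sides can be checked quickly (assuming Luminous or newer):

ceph versions   # per-daemon server versions
ceph features   # release/feature summary of the connected clients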

> On Sep 9, 2023, at 5:28 AM, Eugen Block  wrote:
> 
> Hi,
> 
> is it an actual requirement to redeploy MONs? Because almost all clusters we 
> support or assist with have MONs and OSDs colocated. MON daemons are quite 
> light-weight services, so if it's not really necessary, I'd leave it as it is.
> If you really need to move the MONs to different servers, I'd recommend 
> adding the new MONs one by one. Your monmap will then contain old and new MONs, 
> and when all new MONs (with new IPs) are up and running you can remove the 
> old MON daemons. There's no need to switch off OSDs or drain a host. You can 
> find more information in the Nautilus docs [1] where the orchestrator wasn't 
> available yet.
> 
> Regards,
> Eugen
> 
> [1] https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-mons/
> 
> Quoting Ramin Najjarbashi :
> 
>> Hi
>> 
>> I am writing to seek guidance and best practices for a maintenance operation
>> in my Ceph cluster. I have an older cluster in which the Monitors (Mons)
>> and Object Storage Devices (OSDs) are currently deployed on the same host.
>> I am interested in separating them while ensuring zero downtime and
>> minimizing risks to the cluster's stability.
>> 
>> The primary goal is to deploy new Monitors on different servers without
>> causing service interruptions or disruptions to data availability.
>> 
>> The challenge arises because updating the configuration to add new Monitors
>> typically requires a restart of all OSDs, which is less than ideal in terms
>> of maintaining cluster availability.
>> 
>> One approach I considered is to reweight all OSDs on the host to zero,
>> allowing data to gradually transfer to other OSDs. Once all data has been
>> safely migrated, I would proceed to remove the old OSDs. Afterward, I would
>> deploy the new Monitors on a different server with the previous IP
>> addresses and deploy the OSDs on the old Monitors' host with new IP
>> addresses.
>> 
>> While this approach seems to minimize risks, it can be time-consuming and
>> may not be the most efficient way to achieve the desired separation.
>> 
>> I would greatly appreciate the community's insights and suggestions on the
>> best approach to achieve this separation of Mons and OSDs with zero
>> downtime and minimal risk. If there are alternative methods or best
>> practices that can be recommended, please share your expertise.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Best practices regarding MDS node restart

2023-09-09 Thread Alexander E. Patrakov
Hello,

I am interested in the best-practice guidance for the following situation.

There is a Ceph cluster with CephFS deployed. There are three servers
dedicated to running MDS daemons: one active, one standby-replay, and one
standby. There is only a single rank.

Sometimes, servers need to be rebooted for reasons unrelated to Ceph.
What's the proper procedure to follow when restarting a node that currently
contains an active MDS server? The goal is to minimize the client downtime.
Ideally, they should not notice even if they play MP3s from the CephFS
filesystem (note that I haven't tested this exact scenario) - is this
achievable?

I tried to use the "ceph mds fail mds02" command while mds02 was active and
mds03 was standby-replay, to force the fail-over to mds03 so that I could
reboot mds02. Result: mds02 became standby, while mds03 went through
reconnect (30 seconds), rejoin (another 30 seconds), and replay (5 seconds)
phases. During the "reconnect" and "rejoin" phases, the "Activity" column
of "ceph fs status" is empty, which concerns me. It looks like I just
caused a 65-second downtime. After all of that, mds02 became
standby-replay, as expected.
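
Condensed into commands, the sequence described above was essentially:

ceph fs status          # confirm mds03 is standby-replay
ceph mds fail mds02     # hand the rank over to mds03
ceph fs status          # watch reconnect -> rejoin -> replay -> active
# reboot mds02 only once the new active MDS is serving requests again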

Is there a better way? Or, should I have rebooted mds02 without much
thinking?

-- 
Alexander E. Patrakov
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-09 Thread Eugen Block

Hi,

is it an actual requirement to redeploy MONs? Because almost all  
clusters we support or assist with have MONs and OSDs colocated. MON  
daemons are quite light-weight services, so if it's not really  
necessary, I'd leave it as it is.
If you really need to move the MONs to different servers, I'd  
recommend adding the new MONs one by one. Your monmap will then  
contain old and new MONs, and when all new MONs (with new IPs) are up  
and running you can remove the old MON daemons. There's no need to  
switch off OSDs or drain a host. You can find more information in the  
Nautilus docs [1] where the orchestrator wasn't available yet.
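
As a rough sketch (the mon names are placeholders; creating each new MON itself
follows the manual procedure in [1]):

# after each new MON has been created and started, verify it joined:
ceph mon stat
ceph quorum_status --format json-pretty
# once all new MONs are in quorum, remove the old ones one at a time:
ceph mon remove ceph-old-01
# finally update mon_host in ceph.conf / the config store so clients and
# OSDs learn the new monitor addresses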


Regards,
Eugen

[1] https://docs.ceph.com/en/nautilus/rados/operations/add-or-rm-mons/

Quoting Ramin Najjarbashi :


Hi

I am writing to seek guidance and best practices for a maintenance operation
in my Ceph cluster. I have an older cluster in which the Monitors (Mons)
and Object Storage Devices (OSDs) are currently deployed on the same host.
I am interested in separating them while ensuring zero downtime and
minimizing risks to the cluster's stability.

The primary goal is to deploy new Monitors on different servers without
causing service interruptions or disruptions to data availability.

The challenge arises because updating the configuration to add new Monitors
typically requires a restart of all OSDs, which is less than ideal in terms
of maintaining cluster availability.

One approach I considered is to reweight all OSDs on the host to zero,
allowing data to gradually transfer to other OSDs. Once all data has been
safely migrated, I would proceed to remove the old OSDs. Afterward, I would
deploy the new Monitors on a different server with the previous IP
addresses and deploy the OSDs on the old Monitors' host with new IP
addresses.

While this approach seems to minimize risks, it can be time-consuming and
may not be the most efficient way to achieve the desired separation.

I would greatly appreciate the community's insights and suggestions on the
best approach to achieve this separation of Mons and OSDs with zero
downtime and minimal risk. If there are alternative methods or best
practices that can be recommended, please share your expertise.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Separating Mons and OSDs in Ceph Cluster

2023-09-09 Thread Ramin Najjarbashi
Hi

I am writing to seek guidance and best practices for a maintenance operation
in my Ceph cluster. I have an older cluster in which the Monitors (Mons)
and Object Storage Devices (OSDs) are currently deployed on the same host.
I am interested in separating them while ensuring zero downtime and
minimizing risks to the cluster's stability.

The primary goal is to deploy new Monitors on different servers without
causing service interruptions or disruptions to data availability.

The challenge arises because updating the configuration to add new Monitors
typically requires a restart of all OSDs, which is less than ideal in terms
of maintaining cluster availability.

One approach I considered is to reweight all OSDs on the host to zero,
allowing data to gradually transfer to other OSDs. Once all data has been
safely migrated, I would proceed to remove the old OSDs. Afterward, I would
deploy the new Monitors on a different server with the previous IP
addresses and deploy the OSDs on the old Monitors' host with new IP
addresses.
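
In command form, that reweight-to-zero path would look roughly like this per OSD
(the OSD id is a placeholder):

ceph osd crush reweight osd.42 0.0
ceph -s                                    # wait until backfill has finished
ceph osd safe-to-destroy 42                # sanity check before removal
ceph osd purge 42 --yes-i-really-mean-it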

While this approach seems to minimize risks, it can be time-consuming and
may not be the most efficient way to achieve the desired separation.

I would greatly appreciate the community's insights and suggestions on the
best approach to achieve this separation of Mons and OSDs with zero
downtime and minimal risk. If there are alternative methods or best
practices that can be recommended, please share your expertise.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io