Hi Venky,

Thank you for your help. We managed to shut down mds.1:
We ran "ceph fs set <fs_name> max_mds 1" and waited for about 30 minutes. In the 
first couple of minutes, strays were migrated from mds.1 to mds.0. After that, 
the stray export hung and mds.1 remained in the stopping state. After about 30 
minutes, we restarted mds.1, which left us with one active MDS and two standby 
MDSs. However, we are not sure whether the remaining strays were migrated.

When we took a closer look at the MDS perf counters, we noticed that the value 
of strays_enqueued is quite high and constantly increasing. Is this to be 
expected? What exactly does the counter "strays_enqueued" mean?

ceph daemon mds.0 perf dump | grep stray
        "num_strays": 49846,
        "num_strays_delayed": 21,
        "num_strays_enqueuing": 0,
        "strays_created": 2042124,
        "strays_enqueued": 2396076,
        "strays_reintegrated": 44207,
        "strays_migrated": 38,
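
To put a number on "constantly increasing", the grep output above can be captured twice, some seconds apart, and the two samples compared. The following is only a minimal sketch under stated assumptions: the helper names parse_stray_counters and counter_rates are hypothetical, and the parser assumes nothing beyond the '"name": value,' line format shown above.

```python
import re

# First sample: the grep output quoted in this mail.
SAMPLE = '''
        "num_strays": 49846,
        "num_strays_delayed": 21,
        "num_strays_enqueuing": 0,
        "strays_created": 2042124,
        "strays_enqueued": 2396076,
        "strays_reintegrated": 44207,
        "strays_migrated": 38,
'''


def parse_stray_counters(text):
    """Parse lines of the form '"name": 123,' into a dict of ints.

    Hypothetical helper; it only relies on the line format shown above.
    """
    return {name: int(value)
            for name, value in re.findall(r'"(\w+)"\s*:\s*(\d+)', text)}


def counter_rates(before, after, interval_seconds):
    """Return the per-second change of every counter present in both samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {name: (after[name] - before[name]) / interval_seconds
            for name in before if name in after}


first = parse_stray_counters(SAMPLE)
print(first["strays_enqueued"])
```

A second capture taken later would be parsed the same way and passed to counter_rates together with the elapsed seconds. A counter whose rate stays at zero is not moving, while a steadily positive rate for strays_enqueued would confirm the growth described above.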

Would it be safe to run "ceph orch upgrade resume" at this point? At the 
moment, the MONs and OSDs are running 17.2.6, while the MDSs and RGWs are 
running 17.2.5, so the MDSs and RGWs still have to be upgraded eventually.

Best, Tobias
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
