On Thu, May 27, 2021 at 12:38:07PM +0200, Mark Schouten wrote:
> On Thu, May 27, 2021 at 06:25:44AM +0000, Martin Rasmus Lundquist Hansen wrote:
> > After scaling the number of MDS daemons down, we now have a daemon
> > stuck in the "up:stopping" state. The documentation says it can take
> > several minutes to stop the daemon, but it has been stuck in this
> > state for almost a full day. According to the "ceph fs status" output
> > attached below, it still holds information about 2 inodes, which we
> > assume is the reason why it cannot stop completely.
> > 
> > Does anyone know what we can do to finally stop it?
> 
> I have no clients, and it still does not want to stop rank 1. The
> funny thing is, while trying to fix this by restarting the MDS
> daemons, I sometimes see a list of clients popping up in the
> dashboard, even though no clients are connected...
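
I assume the authoritative way to check for lingering sessions, rather
than trusting the dashboard, is to ask the MDS directly, something
along these lines (<name> being the stopping daemon):

  # List client sessions on the MDS; an empty list would confirm
  # that nothing is actually connected.
  ceph tell mds.<name> session ls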

Enabling debug logging shows me the following:
https://p.6core.net/p/rlMaunS8IM1AY5E58uUB6oy4


I have quite a lot of hard links on this filesystem, which I have seen
cause 'No space left on device' errors. I have
mds_bal_fragment_size_max set to 200000 to mitigate that.
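
(For reference, assuming a release with the centralized config store,
I set that roughly like this; the value is just what worked for me:)

  # Raise the per-fragment size limit for all MDS daemons so that
  # large stray directories do not hit 'No space left on device'.
  ceph config set mds mds_bal_fragment_size_max 200000
  # Verify what the daemons will actually pick up.
  ceph config get mds mds_bal_fragment_size_max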

The message 'waiting for strays to migrate' makes me feel like I should
push the MDS to migrate them somehow... But how?
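
An untested guess on my side: the stray counters in the MDS cache perf
counters should at least show whether anything is still moving, e.g.:

  # Watch the stray counters on the stopping rank via its admin socket;
  # num_strays should reach zero before the rank can finish stopping.
  ceph daemon mds.<name> perf dump mds_cache | grep -i stray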

-- 
Mark Schouten     | Tuxis B.V.
KvK: 74698818     | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl