[ceph-users] Re: mds openfiles table shards

2021-10-06 Thread Stefan Kooman
On 1/21/21 16:51, Dan van der Ster wrote: Hi all, During rejoin an MDS can sometimes go OOM if the openfiles table is too large. The workaround has been described by ceph devs as "rados rm -p cephfs_metadata mds0_openfiles.0". On our cluster we have several such objects for rank 0: mds0_openfi
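
A minimal sketch of how one might pick out the several openfiles objects for a given rank from a pool listing before deciding what to do with them. The naming pattern `mds<rank>_openfiles.<index>` is inferred only from the `mds0_openfiles.0` example quoted above; treating the suffix as a hexadecimal index is an assumption, and the function name `openfiles_shards` is hypothetical.

```python
import re


def openfiles_shards(object_names, rank=0):
    """Return the openfiles table objects for one MDS rank, ordered by index.

    Assumption: objects are named mds<rank>_openfiles.<index>, with the
    index parsed as hexadecimal. Anything else in the listing is ignored.
    """
    pattern = re.compile(rf"^mds{rank}_openfiles\.([0-9a-f]+)$")
    matches = []
    for name in object_names:
        m = pattern.match(name.strip())
        if m:
            # Keep the numeric index so shards sort numerically, not lexically.
            matches.append((int(m.group(1), 16), name.strip()))
    return [name for _, name in sorted(matches)]


# Illustrative listing only; these names are made up for the example.
listing = [
    "mds0_openfiles.1",
    "mds1_openfiles.0",
    "mds0_openfiles.0",
    "mds0_inotable",
]
print(openfiles_shards(listing, rank=0))
```

The input lines could come from `rados -p cephfs_metadata ls`; the sketch only filters names and does not remove anything.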

[ceph-users] Re: mds openfiles table shards

2021-01-22 Thread Dan van der Ster
Just to follow up with an anecdote -- I had asked the question because we had to do a planned failover of one of our MDSs. The intervention went well and we didn't need to remove the openfiles table objects. We stopped the active mds.0, then the standby took over -- the rejoin step took around 5 mi