On 1/21/21 16:51, Dan van der Ster wrote:
Hi all,

During rejoin, an MDS can sometimes go OOM if the openfiles table is too large.
The workaround has been described by the Ceph devs as "rados rm -p
cephfs_metadata mds0_openfiles.0".

On our cluster we have several such objects for rank 0:

mds0_openfiles.0 exists with size: 199978
mds0_openfiles.1 exists with size: 153650
mds0_openfiles.2 exists with size: 40987
mds0_openfiles.3 exists with size: 7746
mds0_openfiles.4 exists with size: 413

If we suffer such an OOM, do we need to rm *all* of those objects or
only the `.0` object?

According to Patrick in thread [1]:

"You can simply delete all of the metadata pool objects
matching this format: "mds%d_openfiles.%x". No data loss will occur."
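
So that would be all five objects listed above, not only the `.0` one. A
minimal sketch of how the matching objects could be picked out of a pool
listing, assuming the names come from something like "rados -p
cephfs_metadata ls". The helper names (openfiles_objects, rm_commands) and
the sample listing are made up for illustration. The script only prints the
"rados rm" commands as a dry run and does not touch the cluster:

```python
import re

# Pattern from the quoted advice: "mds%d_openfiles.%x"
# (%d = decimal MDS rank, %x = lowercase hexadecimal object index).
OPENFILES_RE = re.compile(r"mds(\d+)_openfiles\.([0-9a-f]+)")


def openfiles_objects(object_names, rank=None):
    """Return the names matching mds<rank>_openfiles.<hex>.

    If rank is given, keep only the objects belonging to that MDS rank.
    """
    matches = []
    for name in object_names:
        m = OPENFILES_RE.fullmatch(name)
        if m and (rank is None or int(m.group(1)) == rank):
            matches.append(name)
    return matches


def rm_commands(object_names, pool="cephfs_metadata", rank=None):
    """Build (but do not run) one 'rados rm' command per matching object."""
    return [
        f"rados rm -p {pool} {name}"
        for name in openfiles_objects(object_names, rank)
    ]


if __name__ == "__main__":
    # Hypothetical pool listing: the five objects from this thread plus one
    # unrelated name that must not be selected.
    listing = [
        "mds0_openfiles.0",
        "mds0_openfiles.1",
        "mds0_openfiles.2",
        "mds0_openfiles.3",
        "mds0_openfiles.4",
        "some_other_object",
    ]
    for cmd in rm_commands(listing, rank=0):
        print(cmd)
```

Review the printed commands before running any of them by hand; the pool
name defaults to cephfs_metadata as in the original workaround and may
differ on other clusters.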

Gr. Stefan

[1]: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-October/037278.html