Hi All, 

We recently experienced a failure with our 12.2.4 cluster running a CephFS
instance that resulted in some data loss, after a seemingly problematic OSD
blocked IO on its PGs. We restarted the (single active) MDS daemon during
this, which caused metadata damage because the journal did not get a chance
to flush back. We reset the journal, session table, and fs to bring the
filesystem back online. We then removed some directories/inodes that were
causing the cluster to report damaged metadata (and that were visibly broken
when navigating the filesystem).
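
For reference, the reset steps were along the lines of the standard Luminous
sequence (reconstructed from the docs rather than our shell history, so the
exact invocations may have differed slightly):

    cephfs-journal-tool journal reset
    cephfs-table-tool all reset session
    ceph fs reset <fs name> --yes-i-really-mean-it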

With that, there are now some paths that appear to have been orphaned (which
we expected). We did not run the 'cephfs-data-scan' tool [0] at the time, in
the interest of getting the system back online ASAP. Now that the filesystem
is otherwise stable, can we safely initiate a scan_links operation while the
MDS is active?
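
Concretely, the operation in question from the linked doc is:

    cephfs-data-scan scan_links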

[0] 
http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/#recovery-from-missing-metadata-objects

Thanks much, 
Ryan Leimenstoll
rleim...@umiacs.umd.edu
University of Maryland Institute for Advanced Computer Studies

