Hi All, 

We recently experienced a failure with our 12.2.4 cluster running a CephFS
instance that resulted in some data loss, after a seemingly problematic OSD
blocked IO on its PGs. We restarted the (single active) MDS daemon during
this, which caused metadata damage because the journal did not get a chance
to flush back. We reset the journal, session table, and fs to bring the
filesystem back online. We then removed some directories/inodes that were
causing the cluster to report damaged metadata (and that were visibly broken
when navigating the filesystem).
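
For reference, the reset steps were along the lines of the standard Luminous
sequence (reconstructed from the docs rather than our shell history, so the
exact invocations may have differed slightly):

    cephfs-journal-tool journal reset
    cephfs-table-tool all reset session
    ceph fs reset <fs name> --yes-i-really-mean-it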

With that, there are now some paths that appear to have been orphaned (which
we expected). We did not run the 'cephfs-data-scan' tool [0] at the time, in
the interest of getting the system back online ASAP. Now that the filesystem
is otherwise stable, can we safely initiate a scan_links operation while the
MDS is active?
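
Concretely, the operation in question from the linked doc is:

    cephfs-data-scan scan_links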

[0] 
http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/#recovery-from-missing-metadata-objects

Thanks much, 
Ryan Leimenstoll
rleim...@umiacs.umd.edu
University of Maryland Institute for Advanced Computer Studies

