Hi all,
We're after a bit of advice to make sure we're approaching this the right way.
(version: 12.2.12, multi-mds, dirfrag is enabled)
We have corrupt metadata, as identified by Ceph:
health: HEALTH_ERR
2 MDSs report damaged metadata
Asking the MDS via damage ls gives:
{
"damage_type": "dir_frag",
"id": 2265410500,
"ino": 2199349051809,
"frag": "*",
"path":
"/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1"
}
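(For anyone comparing this against MDS log lines: the decimal "ino" from damage ls is the same identifier the MDS prints in hex, so a quick conversion lets you match damage entries to log output. A sketch, using the value from the listing above:)

```shell
# Convert the decimal inode from `damage ls` to the hex form used in MDS logs.
ino=2199349051809
printf '0x%x\n' "$ino"   # -> 0x200136b41a1
```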
We've done the steps outlined here ->
http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/ namely:
cephfs-journal-tool --rank=<fs name>:all journal reset (both ranks)
cephfs-data-scan scan_extents / scan_inodes / scan_links have all completed
However, when attempting to access the named folder we get:
2019-05-31 03:16:04.792274 7f56f6fb5700 -1 log_channel(cluster) log [ERR] : dir
0x200136b41a1 object missing on disk; some files may be lost
(/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1)
We get this error, followed shortly by an MDS failover.
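("object missing on disk" suggests the dirfrag object itself is gone from the metadata pool. You can confirm that directly with rados: dirfrag objects are named <ino hex>.<frag hex>, with frag 0 being 00000000. A sketch; the pool name cephfs_metadata is an assumption, check `ceph fs ls` for yours:)

```shell
# Build the dirfrag object name for the damaged directory.
ino=2199349051809
obj="$(printf '%x' "$ino").00000000"
echo "$obj"   # -> 200136b41a1.00000000
# Then check whether it actually exists in the metadata pool
# ("cephfs_metadata" is an assumed pool name):
#   rados -p cephfs_metadata stat "$obj"   # ENOENT would confirm it's missing
```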
Two questions, really.
What's not immediately clear from the documentation: do we also need to run
the below?
# Session table
cephfs-table-tool 0 reset session
# SnapServer
cephfs-table-tool 0 reset snap
# InoTable
cephfs-table-tool 0 reset inode
# Root inodes ("/" and MDS directory)
cephfs-data-scan init
And secondly, our current train of thought is that we need to grab the inode
number of the parent folder and delete the dangling dentry from the metadata
pool via rados rmomapkey. Is this correct?
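(If we do go the rmomapkey route, our understanding is that the key to remove lives in the omap of the *parent* directory's dirfrag object, and that dentry keys are named <filename>_head. A sketch of the bookkeeping, with the pool name and the parent inode left as assumptions to fill in:)

```shell
# The damaged dir is ".../21832/1", so its dentry should be the omap key
# "1_head" in the parent's ("21832") dirfrag object "<parent_ino_hex>.00000000".
child_name="1"
dentry_key="${child_name}_head"
echo "$dentry_key"   # -> 1_head
# With <parent_ino_hex> found from the MDS log, and "cephfs_metadata" assumed:
#   rados -p cephfs_metadata listomapkeys "<parent_ino_hex>.00000000"
#   rados -p cephfs_metadata rmomapkey   "<parent_ino_hex>.00000000" "$dentry_key"
```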
Any input appreciated
Cheers,
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com