Hi there, newcomer here.

I've been trying to figure out if it's possible to repair or recover CephFS 
after some unfortunate issues a couple of months ago; the two nodes involved 
have been offline most of the time since the incident.

I'm sure the problem is that I lack the Ceph expertise to quite suss out where 
the broken bits are. This was a 2-node cluster (I know, I know) that had a 
hypervisor primary disk fail, and the entire OS was lost. I reinstalled the 
hypervisor, rejoined it to the cluster (Proxmox), rejoined Ceph to the other 
node, and re-added the OSDs. The cluster came back with quorum problems; some 
PGs were inconsistent and some were lost. Some of that is due to my own 
fiddling around, which possibly exacerbated things. Eventually I had to edit 
the monmap down to 1 monitor, which had all kinds of screwy journal issues... 
it's been a while since I've tried resuscitating this, so the details in my 
memory are fuzzy.
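
For what it's worth, the monmap surgery was roughly the standard 
extract/edit/inject procedure, something like the below (from memory, so treat 
the mon names and paths as placeholders for whatever I actually used):
```
# Stop the surviving monitor before editing its map
systemctl stop ceph-mon@pve02

# Pull the current monmap out of the mon's store and inspect it
ceph-mon -i pve02 --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap

# Drop the monitor that no longer exists, then inject the edited map
monmaptool --rm pve01 /tmp/monmap
ceph-mon -i pve02 --inject-monmap /tmp/monmap

# Bring the surviving mon back up
systemctl start ceph-mon@pve02
```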

My cluster health isn't awful. Output is basically this:
```
root@pve02:~# ceph -s
  cluster:
    id:     8b31840b-5706-4c92-8135-0d6e03976af1
    health: HEALTH_ERR
            1 filesystem is degraded
            1 filesystem is offline
            1 mds daemon damaged
            noout flag(s) set
            16 daemons have recently crashed

  services:
    mon: 1 daemons, quorum pve02 (age 3d)
    mgr: pve01(active, since 4d)
    mds: 0/1 daemons up
    osd: 7 osds: 7 up (since 2d), 7 in (since 7w)
         flags noout

  data:
    volumes: 0/1 healthy, 1 recovering; 1 damaged
    pools:   5 pools, 576 pgs
    objects: 1.51M objects, 4.0 TiB
    usage:   8.2 TiB used, 9.1 TiB / 17 TiB avail
    pgs:     575 active+clean
             1   active+clean+scrubbing+deep

  io:
    client:   241 KiB/s wr, 0 op/s rd, 10 op/s wr
```
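
For more detail behind that summary I've mostly been looking at the following; 
happy to paste full output from any of them if it would help:
```
# Per-check detail behind the HEALTH_ERR summary
ceph health detail

# Filesystem / MDS rank state
ceph fs status
ceph fs dump

# The 16 recent crashes mentioned above
ceph crash ls
```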

I've tried a couple of times to run through the steps in here 
(https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/), but I 
always hit an error at scan_links, where I get a crash dump of sorts. If I try 
to mark the cephfs as repaired/joinable, the MDS daemons will try to replay and 
then fail. The only occurrence of err/ERR in the MDS logs is a line like this:
```
2022-05-07T18:31:26.342-0500 7f22b44d8700  1 mds.0.94  waiting for osdmap 301772 (which blocklists prior instance)
2022-05-07T18:31:26.346-0500 7f22adccb700 -1 log_channel(cluster) log [ERR] : failed to read JournalPointer: -1 ((1) Operation not permitted)
2022-05-07T18:31:26.346-0500 7f22af4ce700  0 mds.0.journaler.pq(ro) error getting journal off disk
```
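
For reference, the sequence I've been running is basically the one from that 
doc page, roughly as below (the fs and pool names here are placeholders for my 
actual ones):
```
# Back up the MDS journal before touching anything
cephfs-journal-tool --rank=cephfs:0 journal export /root/journal-backup.bin

# Recover what we can from the journal, then reset it and the session table
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset
cephfs-table-tool all reset session

# Rebuild metadata from the data pool
cephfs-data-scan init
cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_links        # <-- this is where it crashes for me

# Then mark rank 0 repaired and let an MDS try to join again
ceph mds repaired cephfs:0
ceph fs set cephfs joinable true
```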

I haven't had much luck on the googles diagnosing that JournalPointer error; it 
seems uncommon. My hope is that the cephfs_data pool is fine. I never had any 
inconsistent PG issues on a pool other than the metadata pool, so that's the 
only one that suffered actual acute injury during the hardware failure/quorum 
loss.

If I had more experience with the rados tools, I'd probably be more helpful. I 
have plenty of logs lying about and can run any diagnostics that might help, 
but I hate to spam too much here right out of the gate.
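
For example, if someone can confirm it's the right direction, I was planning on 
checks along these lines (I'm assuming my metadata pool is named 
cephfs_metadata and that 400.00000000 is the rank-0 JournalPointer object; 
corrections welcome):
```
# Does the metadata pool answer reads at all?
rados -p cephfs_metadata ls | head

# Is the rank-0 journal pointer object present and stat-able?
# (400.<rank> is my assumption for the JournalPointer object name)
rados -p cephfs_metadata stat 400.00000000

# Read-only look at the MDS journal header and integrity
cephfs-journal-tool --rank=cephfs:0 header get
cephfs-journal-tool --rank=cephfs:0 journal inspect
```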

