Hi,

We've got an outstanding issue with one of our Ceph clusters here at RAL: 
'Echo', our 40PB cluster. We found an object from an 8+3 EC RGW pool in the 
failed_repair state. We aren't sure how the object got into this state, but it 
doesn't appear to be a case of correlated drive failure (the rest of the PG is 
fine). In any case, the detail of how we got into this state isn't our focus; 
our focus is how to get the PG back to a clean state.

The object in question (named OBJNAME for our purposes) is from a RadosGW data 
pool. The problem initially presented as a PG in the failed_repair state, and 
repeated attempts to get the PG to repair failed. At that point we contacted 
the user who owns the data and determined that the data in question is also 
stored elsewhere, so we could safely delete the object. We did that using 
radosgw-admin object rm OBJNAME, and confirmed that the object is gone via 
several approaches (radosgw-admin object stat, rados ls --pgid PGID | grep 
OBJNAME).
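
For reference, the removal and verification steps looked roughly like the 
following. This is a sketch rather than a transcript; BUCKET, OBJNAME, and 
PGID are placeholders for the real names, which I've omitted here:

```shell
# Remove the object through RGW (the user confirmed the data exists elsewhere)
radosgw-admin object rm --bucket=BUCKET --object=OBJNAME

# Confirm RGW no longer knows about the object (this now returns an error)
radosgw-admin object stat --bucket=BUCKET --object=OBJNAME

# Confirm the backing RADOS object is gone from the PG (no output expected)
rados ls --pgid PGID | grep OBJNAME
```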

So far, so good. Except, even after the object was deleted and in spite of many 
instructions to repair, the placement group is still in the state 
active+clean+inconsistent+failed_repair, and the cluster won't go to HEALTH_OK. 
Here's what the log from one of these repair attempts looks like (from the log 
on the primary OSD).

2022-05-08 16:23:43.898 7f79d3872700  0 log_channel(cluster) log [DBG] : 11.2b5 
repair starts
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 1899(8) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 1911(7) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 2842(10) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 3256(6) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 3399(5) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 3770(9) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 5206(3) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
shard 6047(4) soid 11:ad45a433:::OBJNAME:head : candidate had an ec size 
mismatch
2022-05-08 16:51:38.807 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
soid 11:ad45a433:::OBJNAME:head : failed to pick suitable object info
2022-05-08 19:03:12.690 7f79d3872700 -1 log_channel(cluster) log [ERR] : 11.2b5 
repair 11 errors, 0 fixed

Looking for inconsistent objects in the PG doesn't report anything odd about 
this object; right now we get the rather odd output below, though that may 
well be a red herring.

[root@ceph-adm1 ~]# rados list-inconsistent-obj 11.2b5
No scrub information available for pg 11.2b5
error 2: (2) No such file or directory

We don't get this output from this command on any other PG that we've tried.
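
As I understand it, the inconsistency report is only populated by scrub 
results, so one thing we could try is forcing a fresh deep scrub of the PG 
and then re-querying. A sketch (PG id as above):

```shell
# Force a deep scrub of the affected PG
ceph pg deep-scrub 11.2b5

# Check when the scrub last completed on this PG
ceph pg 11.2b5 query | grep -i scrub

# Once the deep scrub has finished, re-query the inconsistency report
rados list-inconsistent-obj 11.2b5 --format=json-pretty
```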

So what next? To reiterate, this isn't about data recovery; it's about getting 
the cluster back to a healthy state. I should also note that the issue doesn't 
seem to be impacting the cluster beyond leaving that PG showing as being in a 
bad state.

Rob Appleyard


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io