Long-running cluster, currently running 14.2.6.

I have a user whose buckets appear to have become corrupted, in that the
following commands:

radosgw-admin bucket check --bucket <bucket>
radosgw-admin bucket list --bucket=<bucket>

return with the following:
ERROR: could not init bucket: (2) No such file or directory
2020-08-04 13:47:03.417 7f94dfea86c0 -1 ERROR: get_bucket_instance_from_oid failed: -2

radosgw-admin metadata get bucket:<bucket>
is successful.

radosgw-admin metadata get bucket.instance:<bucket>:<bucket_id>
yields: ERROR: can't get key: (2) No such file or directory

radosgw-admin metadata list bucket.instance | grep -i <bucket>
yields no results.

When I drop down to rados and look in the index pool, I can see 128 objects
matching the bucket_id derived from the "metadata get" output, which seems
consistent with other, functioning buckets.
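
For anyone wanting to reproduce that check, here is a rough sketch of how the index objects can be counted. It assumes the index shard objects follow the usual `.dir.<bucket_id>.<shard>` naming. The function name `count_index_shards` is only an illustration, and the pool name in the usage comment is a placeholder for whatever your zone's index pool is called.

```shell
#!/bin/sh
# Count bucket index shard objects for one bucket_id.
# Reads object names (one per line) on stdin, e.g. from `rados ls`.
# Assumption: shard objects are named ".dir.<bucket_id>.<shard>".
count_index_shards() {
    # Match on a literal prefix with a trailing dot so that an id such
    # as "x.1" does not also count objects belonging to "x.10".
    awk -v prefix=".dir.$1." \
        'index($0, prefix) == 1 { n++ } END { print n + 0 }'
}

# Hypothetical usage against a live cluster (pool name is a placeholder):
#   rados -p default.rgw.buckets.index ls | count_index_shards "<bucket_id>"
```

Comparing the count against a known-good bucket with the same shard count is only a sanity check that the index objects exist; it says nothing about whether their contents are intact.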

Unfortunately, this issue began many months ago and went unnoticed.
We have not retained many of the ceph logs from this time. We do have the
civetweb access logs and have found that error codes began on the same day
that we lowered the pg_num on many of the rgw pools (all of them but the
index_pool and the data_pool). OSDs were filestore at that time and have
since been converted to bluestore. Other than the dates lining up we have
no direct evidence these are related, and did not encounter any
inconsistent PGs. We also used this process on other clusters with no ill
effects.

Ideally I would like to repair and restore the functionality of these
buckets given that it appears the objects in the index pool still exist. Is
there any way to repair these? Do these errors correlate to any known
issues? Thanks in advance for any leads.


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io