[ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

Maarten De Quick Tue, 04 Jul 2017 11:47:21 -0700

Hi,

Background: We're having issues with our index pool: slow requests and
timeouts crash an OSD and trigger a recovery, which in turn causes
application issues. We know we have very big buckets (e.g. a bucket of
77 million objects with only 16 shards) that need resharding, so we have
been looking at the resharding process.
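
To put the shard count in perspective, here is a rough sketch of the
arithmetic. The function names are made up for illustration, and the
target of 100000 objects per shard is an assumption (a commonly quoted
guideline), not a number taken from our cluster configuration.

```shell
#!/bin/sh
# Average number of objects per index shard (integer division).
# Usage: objects_per_shard TOTAL_OBJECTS SHARD_COUNT
objects_per_shard() {
    echo $(( $1 / $2 ))
}

# Smallest shard count that keeps every shard at or below a target size.
# Usage: shards_needed TOTAL_OBJECTS TARGET_PER_SHARD
# The target value passed in is an assumption, not a figure from this thread.
shards_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

objects_per_shard 77000000 16     # prints 4812500
shards_needed 77000000 100000     # prints 770
```

So the 77 million object bucket currently averages about 4.8 million
index entries per shard.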


The first thing we would like to do is make a backup of the bucket index,
but this failed with:

# radosgw-admin -n client.radosgw.be-west-3 bi list --bucket=priv-prod-up-alex > /var/backup/priv-prod-up-alex.list.backup
2017-07-03 21:28:30.325613 7f07fb8bc9c0  0 System already converted
ERROR: bi_list(): (4) Interrupted system call

When I grep the backup file for "idx" and count the matching lines:
# grep idx priv-prod-up-alex.list.backup | wc -l
2294942
When I run bucket stats for that bucket I get:
# radosgw-admin -n client.radosgw.be-west-3 bucket stats --bucket=priv-prod-up-alex | grep num_objects
2017-07-03 21:33:05.776499 7faca49b89c0  0 System already converted
            "num_objects": 20148575

It looks like roughly 18 million objects are missing and the backup is
incomplete (not sure if that's a correct assumption?). We're also afraid
that the resharding command will hit the same issue.
Has anyone seen this behaviour before, or does anyone have thoughts on how
to fix it?
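
For reference, the gap can be computed as in the sketch below. It assumes
the dump contains one line matching "idx" per index entry (the same
assumption as the grep above), and the function names are hypothetical
helpers, not part of radosgw-admin.

```shell
#!/bin/sh
# Count index entries in a "bi list" dump by counting lines that match
# "idx", assuming one such line per entry.
# Usage: count_idx_entries DUMP_FILE
count_idx_entries() {
    grep -c idx "$1"
}

# Difference between num_objects from "bucket stats" and the entries
# found in the dump.
# Usage: missing_entries NUM_OBJECTS DUMPED_ENTRIES
missing_entries() {
    echo $(( $1 - $2 ))
}

# Numbers taken from the output above.
missing_entries 20148575 2294942    # prints 17853633
```

With the real file this would be called as
missing_entries 20148575 "$(count_idx_entries priv-prod-up-alex.list.backup)".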

We were also wondering whether we really need the backup. As the resharding
process creates a completely new index and keeps the old bucket, is there
maybe a possibility to relink the bucket to the old bucket in case of
issues? Or am I missing something important here?

Any help would be greatly appreciated, thanks!

Regards,
Maarten

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
