We're having the same issue. I have a 1200 TB pool reporting 90% utilization, 
but disk utilization is only 40%. 

Tyler Bishop 
Chief Technical Officer 
513-299-7108 x10 



tyler.bis...@beyondhosting.net 


From: "Brian Felton" <bjfel...@gmail.com> 
To: "ceph-users" <ceph-us...@ceph.com> 
Sent: Wednesday, July 27, 2016 9:24:30 AM 
Subject: [ceph-users] Cleaning Up Failed Multipart Uploads 

Greetings, 

Background: If an object storage client re-uploads parts to a multipart object, 
RadosGW does not clean up all of the parts properly when the multipart upload 
is aborted or completed. You can read all of the gory details (including 
reproduction steps) in this bug report: http://tracker.ceph.com/issues/16767 . 

My setup: a Hammer 0.94.6 cluster used only for S3-compatible object storage. 
The RGW stripe size is 4 MiB. 

My problem: I have buckets reporting terabytes more utilization (and, in one 
case, 200k more objects) than they should. I am trying to remove the detritus 
left by the failed multipart uploads, but removing the leftover parts directly 
from the .rgw.buckets pool has no effect on bucket utilization (i.e. neither 
the object count nor the space used is declining). 

To give an example, I have a client that uploaded a very large multipart 
object (8000 parts of 15 MiB each). Due to a bug in the client, it uploaded 
each of the 8000 parts six times. After the sixth attempt, it gave up and 
aborted the upload, at which point RGW removed the 8000 parts from the sixth 
attempt. When I list the bucket's contents with radosgw-admin (radosgw-admin 
bucket list --bucket=<bucket> --max-entries=<size of bucket>), I see all 8000 
of the object's parts five separate times, each under the 'multipart' 
namespace. Those five leftover copies alone account for roughly 5 x 8000 x 
15 MiB, or about 586 GiB, of phantom usage. 
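
For anyone who wants to follow along, here is roughly how I enumerate the 
leftover part objects at the rados level. The '<bucket_id>' placeholder is the 
id reported by 'radosgw-admin bucket stats --bucket=<bucket>', and the name 
patterns follow the multipart/shadow layout I describe below: 

    # head objects for the leftover parts 
    rados -p .rgw.buckets ls | grep "<bucket_id>__multipart_" > parts.txt 
    # stripes holding the remainder of each part's data 
    rados -p .rgw.buckets ls | grep "<bucket_id>__shadow_" > shadows.txt 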

Since the multipart upload was aborted, I can't remove the object by name via 
the S3 interface. Since my RGW stripe size is 4 MiB, I know that each 15 MiB 
part is stored across four entries in the .rgw.buckets pool: 4 MiB in a 
'multipart' head object, and 4, 4, and 3 MiB in three successive 'shadow' 
objects. I've written a script to remove these parts (rados -p .rgw.buckets rm 
<bucket_id>__multipart_<object+prefix>.<part> and rados -p .rgw.buckets rm 
<bucket_id>__shadow_<object+prefix>.<part>.[1-3]). The removes complete 
successfully (a second attempt to remove the same object fails because it is 
already gone), but I'm not seeing any decrease in the bucket's space used or 
object count. In fact, if I do another 'bucket list', all of the removed parts 
are still included. 
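
For reference, the removal pass boils down to something like the following 
minimal sketch ('<bucket_id>' is a placeholder, and the shadow suffixes follow 
the 4/4/3 MiB layout just described): 

    # parts.txt holds the full '<bucket_id>__multipart_...' names from above 
    while read head; do 
        # remove the 4 MiB 'multipart' head object 
        rados -p .rgw.buckets rm "${head}" 
        # remove the three shadow stripes (4, 4, and 3 MiB) for the same part 
        tail="${head#<bucket_id>__multipart_}" 
        for n in 1 2 3; do 
            rados -p .rgw.buckets rm "<bucket_id>__shadow_${tail}.${n}" 
        done 
    done < parts.txt 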

I've looked at the output of 'gc list --include-all', and the removed parts 
never show up for garbage collection. Garbage collection is otherwise 
functioning normally and successfully removes data for any object that is 
properly deleted via the S3 interface. 
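
Concretely, I'm checking with: 

    # nothing for the removed parts ever appears here 
    radosgw-admin gc list --include-all | grep "<object+prefix>" 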

I've also gone so far as to write a script that lists the contents of the 
bucket's index shards in the .rgw.buckets.index pool, checks for the existence 
of each entry in .rgw.buckets, and removes any entries that cannot be found, 
but that also fails to decrement the bucket's size and object count. 
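
In case it helps anyone reproduce this, the index check looks roughly like the 
sketch below. I'm assuming the usual '.dir.<bucket_id>' naming for the index 
objects and that each omap key corresponds to a head object named 
'<bucket_id>_<key>' in .rgw.buckets, so treat the name handling as an 
assumption rather than gospel: 

    # walk one bucket index shard, drop entries whose head object is gone 
    rados -p .rgw.buckets.index listomapkeys ".dir.<bucket_id>" | 
    while read key; do 
        if ! rados -p .rgw.buckets stat "<bucket_id>_${key}" >/dev/null 2>&1; then 
            rados -p .rgw.buckets.index rmomapkey ".dir.<bucket_id>" "$key" 
        fi 
    done 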

What am I missing here? Where, aside from .rgw.buckets and .rgw.buckets.index, 
does RGW look to determine a bucket's object count and space used? 

Many thanks to any and all who can assist. 

Brian Felton 



_______________________________________________ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 