Re: [ceph-users] Speeding up garbage collection in RGW

Pavan Rallabhandi Tue, 25 Jul 2017 10:22:13 -0700

I’ve just realized that the option is present in Hammer (0.94.10) as well, so 
you should try that.


From: Bryan Stillwell <bstillw...@godaddy.com>
Date: Tuesday, 25 July 2017 at 9:45 PM
To: Pavan Rallabhandi <prallabha...@walmartlabs.com>, 
"ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: EXT: Re: [ceph-users] Speeding up garbage collection in RGW

Unfortunately, we're on hammer still (0.94.10).  That option looks like it 
would work better, so maybe it's time to move the upgrade up in the schedule.

I've been playing with the various gc options, but I haven't seen the kind of 
speedup we would need to remove the objects in a reasonable amount of time.

Thanks,
Bryan

From: Pavan Rallabhandi <prallabha...@walmartlabs.com>
Date: Tuesday, July 25, 2017 at 3:00 AM
To: Bryan Stillwell <bstillw...@godaddy.com>, "ceph-users@lists.ceph.com" 
<ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Speeding up garbage collection in RGW

If your Ceph version is >=Jewel, you can try the `--bypass-gc` option in 
radosgw-admin, which removes the tail objects as well, without marking them 
for garbage collection.
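
As a rough sketch of how that option could be combined with the bucket removal 
command quoted further down in this thread: the wrapper function and the 
DRY_RUN variable are illustrative assumptions, not part of radosgw-admin, and 
the bucket name follows the bucket1..bucket6000 pattern from the original 
command. By default it only prints the command it would run.

```shell
# Hypothetical helper: remove one bucket and its objects, bypassing GC.
# DRY_RUN is an illustrative variable (default 1 = print only).
remove_bucket() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Print the command instead of executing it.
        echo "radosgw-admin bucket rm --bucket=$1 --purge-objects --bypass-gc"
    else
        radosgw-admin bucket rm --bucket="$1" --purge-objects --bypass-gc
    fi
}

remove_bucket bucket1
```

It is probably worth trying this on a single bucket before running it in 
parallel across all of them.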

Thanks,

On 25/07/17, 1:34 AM, "ceph-users on behalf of Bryan Stillwell" 
<ceph-users-boun...@lists.ceph.com on behalf of bstillw...@godaddy.com> wrote:

    I'm in the process of cleaning up a test that an internal customer did on 
our production cluster that produced over a billion objects spread across 6000 
buckets.  So far I've been removing the buckets like this:

    printf %s\\n bucket{1..6000} | xargs -I{} -n 1 -P 32 radosgw-admin bucket rm --bucket={} --purge-objects

    However, the disk usage doesn't seem to be getting reduced at the same rate 
the objects are being removed.  From what I can tell a large number of the 
objects are waiting for garbage collection.

    When I first read the docs it sounded like the garbage collector would only 
remove 32 objects every hour, but after looking through the logs I'm seeing 
about 55,000 objects removed every hour.  That's about 1.3 million a day, so at 
this rate it'll take a couple years to clean up the rest!  For comparison, the 
purge-objects command above is removing (but not GC'ing) about 30 million 
objects a day, so a much more manageable 33 days to finish.
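
The estimates above can be checked with a few lines of shell arithmetic. This 
is only a sketch: the total of one billion objects is taken as a round figure 
from the description of the test, and integer division rounds down.

```shell
total=1000000000          # roughly a billion objects (round figure)
gc_per_hour=55000         # GC rate seen in the logs
purge_per_day=30000000    # rate of the purge-objects command

gc_per_day=$((gc_per_hour * 24))
gc_days=$((total / gc_per_day))
purge_days=$((total / purge_per_day))

echo "GC:    $gc_per_day objects/day, about $gc_days days"   # 1320000, 757
echo "Purge: $purge_per_day objects/day, about $purge_days days"   # 33
```

About 757 days is a little over two years, which matches the "couple years" 
figure.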

    I've done some digging and it appears like I should be changing these 
configuration options:

    rgw gc max objs (default: 32)
    rgw gc obj min wait (default: 7200)
    rgw gc processor max time (default: 3600)
    rgw gc processor period (default: 3600)
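
For reference, these options would normally be set in ceph.conf under the 
gateway's client section. The fragment below is a sketch only: the section 
name is a hypothetical placeholder, and the values are just the defaults 
listed above, not a recommendation.

```
# Hypothetical section name; use the name of your own RGW instance.
[client.rgw.gateway1]
rgw gc max objs = 32
rgw gc obj min wait = 7200
rgw gc processor max time = 3600
rgw gc processor period = 3600
```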

    A few questions I have though are:

    Should 'rgw gc processor max time' and 'rgw gc processor period' always be 
set to the same value?

    Which would be better, increasing 'rgw gc max objs' to something like 1024, 
or reducing the 'rgw gc processor' times to something like 60 seconds?

    Any other guidance on the best way to adjust these values?

    Thanks,
    Bryan


    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
