Re: [ceph-users] Adventures with large RGW buckets

EDH - Manuel Rios Fernandez Thu, 01 Aug 2019 18:11:11 -0700

Hi Greg / Eric,

What about allowing a bucket's objects to be deleted with a lifecycle policy?

You can already set an object lifetime of 1 day; that expiration is handled at 
the cluster level. Then delete any objects younger than 1 day and remove the 
bucket.

That sometimes speeds up deletes, since the work is done by the RGWs.

It should work like a background delete option, because deleting a bucket with 
millions of objects takes weeks.
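
As a rough illustration of the 1-day lifecycle idea above, here is a minimal 
sketch that builds an S3-style lifecycle document expiring every object after 
one day. The function name and rule ID are made up for this example, and it 
assumes the gateway accepts the standard S3 Filter/Prefix rule form. It only 
produces the XML; applying it to a bucket (for example with boto3's 
put_bucket_lifecycle_configuration, which takes the same rule as a dict) is 
left out.

```python
import xml.etree.ElementTree as ET


def expiration_lifecycle_xml(days=1, rule_id="expire-everything"):
    """Build an S3 lifecycle document that expires all objects after `days` days.

    `rule_id` is an arbitrary label chosen for this sketch.
    """
    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    # An empty prefix filter matches every object in the bucket.
    prefix_filter = ET.SubElement(rule, "Filter")
    ET.SubElement(prefix_filter, "Prefix").text = ""
    ET.SubElement(rule, "Status").text = "Enabled"
    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(days)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(expiration_lifecycle_xml(days=1))
```

On a versioned bucket, noncurrent versions would need their own expiration 
rule, so this single rule would not be enough to empty the bucket.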

Regards

-----Original Message-----
From: ceph-users <ceph-users-boun...@lists.ceph.com> On Behalf Of Gregory Farnum
Sent: Thursday, August 1, 2019 22:48
To: Eric Ivancich <ivanc...@redhat.com>
CC: Ceph Users <ceph-users@lists.ceph.com>; d...@ceph.io
Subject: Re: [ceph-users] Adventures with large RGW buckets

On Thu, Aug 1, 2019 at 12:06 PM Eric Ivancich <ivanc...@redhat.com> wrote:
>
> Hi Paul,
>
> I’ll interleave responses below.
>
> On Jul 31, 2019, at 2:02 PM, Paul Emmerich <paul.emmer...@croit.io> wrote:
>
> What could the bucket deletion of the future look like? Would it be 
> possible to put all objects in buckets into RADOS namespaces and 
> implement some kind of efficient namespace deletion on the OSD level 
> similar to how pool deletions are handled at a lower level?
>
> I’ll raise that with other RGW developers. I’m unfamiliar with how RADOS 
> namespaces are handled.

I expect RGW could do this, but unfortunately deleting namespaces at the RADOS 
level is not practical. People keep asking, and maybe in some future world it 
will be cheaper, but a namespace is effectively just part of the object name 
(and I don't think it's even the first thing the key entries in metadata 
tracking sort by!). Deleting a namespace would therefore be equivalent to 
deleting a snapshot[1], with the extra cost that namespaces can be created 
arbitrarily on every write operation, so our solutions for handling snapshots 
without it being ludicrously expensive wouldn't apply. Deleting a namespace 
from the OSD side using map updates would require each OSD to iterate through 
just about all the objects it has and examine them for deletion.

Is it cheaper than doing it over the network? Sure. Is it cheap enough we're 
willing to let a single user request generate that kind of cluster IO on an 
unconstrained interface? Absolutely not.
-Greg
[1]: Deleting snapshots is only feasible because every OSD maintains a sorted 
secondary index from snapid->set<objects>. This is only possible because 
snapids are issued by the monitors and clients cooperate in making sure they 
can't get reused after being deleted.
Namespaces are generated by clients and there are no constraints on their use, 
reuse, or relationship to each other. We could maybe work around these 
problems, but it'd be building a fundamentally different interface than what 
namespaces currently are.
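
To make the cost difference in the footnote concrete, here is a toy model of 
one OSD. It is not Ceph code: the ToyOSD class, its (namespace, name, snapid) 
tuples, and the unsorted dict used for the index are all simplifications 
invented for this sketch. It only shows that a snapid -> objects index lets a 
snapshot delete touch just the affected objects, while a namespace delete with 
no such index has to examine everything the OSD holds.

```python
from collections import defaultdict


class ToyOSD:
    """Toy model of one OSD's object store (illustrative only, not real Ceph)."""

    def __init__(self):
        self.objects = set()             # (namespace, name, snapid) tuples
        self.by_snap = defaultdict(set)  # secondary index: snapid -> objects

    def write(self, namespace, name, snapid):
        key = (namespace, name, snapid)
        self.objects.add(key)
        self.by_snap[snapid].add(key)

    def delete_snapshot(self, snapid):
        """Remove a snapshot's objects; returns how many objects were examined."""
        victims = self.by_snap.pop(snapid, set())
        self.objects -= victims
        return len(victims)

    def delete_namespace(self, namespace):
        """Remove a namespace's objects; returns how many objects were examined."""
        examined = 0
        # No index by namespace exists, so every stored object is inspected.
        for key in list(self.objects):
            examined += 1
            if key[0] == namespace:
                self.objects.discard(key)
                self.by_snap[key[2]].discard(key)
        return examined


if __name__ == "__main__":
    osd = ToyOSD()
    for i in range(1000):
        ns = "bucket-a" if i < 10 else "bucket-b"
        osd.write(ns, f"obj{i}", snapid=1 if i < 5 else 2)
    print("examined for snapshot delete:", osd.delete_snapshot(1))
    print("examined for namespace delete:", osd.delete_namespace("bucket-a"))
```

Adding a namespace index to the toy would be trivial, but as the footnote 
says, the real index works only because snapids are issued by the monitors 
and never reused, which is not true of client-chosen namespaces.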
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
