You bet, glad to help.
Zillions of small files do carry proportionally higher metadata overhead,
and can be problematic in several ways. When using RGW, indexless buckets may
be advantageous.
Another phenomenon is space amplification — with say a 1 GB file/object, a
partially full
Thanks Anthony and Janne, exactly what I have been looking for!
On Fri, Feb 25, 2022 at 9:25 AM Janne Johansson wrote:
> On Fri, 25 Feb 2022 at 08:49, Anthony D'Atri <
> anthony.da...@gmail.com> wrote:
> > There was a similar discussion last year around Software Heritage’s
> archive project,
On Fri, 25 Feb 2022 at 08:49, Anthony D'Atri wrote:
> There was a similar discussion last year around Software Heritage’s archive
> project, suggest digging up that thread.
> Some ideas:
>
> * Pack them into (optionally compressed) tarballs - from a quick search it
> sorta looks like HAR uses a
There was a similar discussion last year around Software Heritage’s archive
project, suggest digging up that thread.
Some ideas:
* Pack them into (optionally compressed) tarballs - from a quick search it
sorta looks like HAR uses a similar model. Store the tarballs as RGW objects,
or as RBD
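
For what it's worth, here is a minimal sketch of the tarball-packing idea, assuming Python and its standard tarfile module. The function name pack_small_files, the batch-NNNNNN naming, and the 64 MiB payload budget are made up for illustration and are not from this thread. It only builds the tarballs and a path-to-tarball index; getting each tarball into RGW or onto an RBD-backed filesystem is left out.

```python
import tarfile
from pathlib import Path


def pack_small_files(src_dir, out_dir, max_bytes=64 * 1024 * 1024, compress=True):
    """Group the files under src_dir into tarballs of roughly max_bytes each.

    Returns a dict mapping each file's relative path to the name of the
    tarball that holds it. max_bytes counts uncompressed payload only.
    Assumes out_dir is not inside src_dir.
    """
    src = Path(src_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    mode = "w:gz" if compress else "w"
    suffix = ".tar.gz" if compress else ".tar"

    index = {}
    tar = None
    tar_name = None
    batch_no = 0
    batch_bytes = 0

    # Sort for a stable, reproducible assignment of files to tarballs.
    for path in sorted(p for p in src.rglob("*") if p.is_file()):
        size = path.stat().st_size
        # Open a new tarball if none is open, or if adding this file would
        # push a non-empty batch past the payload budget.
        if tar is None or (batch_bytes and batch_bytes + size > max_bytes):
            if tar is not None:
                tar.close()
            batch_no += 1
            batch_bytes = 0
            tar_name = f"batch-{batch_no:06d}{suffix}"
            tar = tarfile.open(str(out / tar_name), mode)
        rel = path.relative_to(src).as_posix()
        tar.add(str(path), arcname=rel)
        index[rel] = tar_name
        batch_bytes += size

    if tar is not None:
        tar.close()
    return index
```

The returned index would need to be kept somewhere (a small database, or an object next to the tarballs) so that a single file can be found again without opening every archive. A file larger than the budget simply gets a tarball to itself.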