There was a similar discussion last year around Software Heritage’s archive 
project; I suggest digging up that thread.

Some ideas:

* Pack them into (optionally compressed) tarballs.  From a quick search, it 
looks like HAR uses a similar model.  Store the tarballs as RGW objects, 
or as RBD volumes, or on CephFS.
* Create conventional filesystems on RBD volumes, though depending on size and 
number you might have some space lost to padding.
* SeaweedFS looks like it has small object packing built in.  Use it instead 
of Ceph, or run it on top of RBD volumes.
* I’ve been told that the Mass. Open Cloud folks had prototyped some sort of 
packing for RGW, but I’ve not been able to find any details or a contact.
* With any of these strategies, 30TB Intel / Solidigm QLC SSDs would be fine 
media to use.  With the right chassis and form factor, >1PB/RU raw capacity can 
be realized.  RUs == money ;)
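
To make the tarball idea concrete, here is a minimal sketch in Python using only the standard library's tarfile module. The function names (pack_small_files, read_member) and the in-memory dict of files are my own illustration, not anything Ceph or HAR provides. The resulting bytes could then be written as a single RGW object, a file on CephFS, or similar.

```python
import io
import tarfile


def pack_small_files(files, compress=True):
    """Pack a mapping of name -> bytes into one in-memory tarball.

    Hypothetical helper for illustration; 'files' is assumed to be a
    dict of small payloads that already fit in memory.
    """
    buf = io.BytesIO()
    mode = "w:gz" if compress else "w"
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        # Sort names so the archive layout is deterministic.
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(archive_bytes, name):
    """Read one packed file back out of the tarball by name."""
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        member = tar.extractfile(name)
        if member is None:
            raise KeyError(f"{name} is not a regular file in the archive")
        return member.read()


if __name__ == "__main__":
    small_files = {"a.txt": b"hello", "dir/b.txt": b"world" * 10}
    blob = pack_small_files(small_files)
    print(len(blob), "bytes packed")
    print(read_member(blob, "a.txt"))
```

Note that reading one member from a gzip-compressed tarball means decompressing from the start, so for random access you may prefer uncompressed tarballs, or smaller ones grouped by access pattern.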

— aad


> Hi,
> 
> Is there any archive utility in Ceph similar to Hadoop Archive Utility (HAR)? 
> Or, in other words, how can one archive small files in Ceph?
> 
> Thanks
> 
> 
> _______________________________________________
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
