Hi all,

I have a Ceph cluster with 4+2 EC used as a secondary storage system for offloading big files from another storage system. Although most of the files are big (at least 50MB), we also have some small objects, less than 4MB each. The current storage usage is 358TB of raw data for 237TB of 'usable' data, which means usable data is about 66% of raw (roughly 51% overhead on top of the usable data).
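As a quick sanity check on those figures, here is a small Python sketch of the arithmetic. The names are made up for illustration, and the "ideal" ratio assumes a plain k=4, m=2 profile with no other overhead.

```python
# Figures quoted above, in TB.
RAW_TB = 358
USABLE_TB = 237


def ec_ideal_ratio(k: int, m: int) -> float:
    """Ideal raw/usable ratio for a k+m erasure-coded pool: (k + m) / k."""
    return (k + m) / k


observed_ratio = RAW_TB / USABLE_TB   # raw bytes consumed per usable byte
efficiency = USABLE_TB / RAW_TB       # usable fraction of raw capacity
ideal_ratio = ec_ideal_ratio(4, 2)    # 1.5 for 4+2

print(f"observed raw/usable ratio: {observed_ratio:.3f}")
print(f"usable fraction of raw:    {efficiency:.1%}")
print(f"ideal 4+2 raw/usable:      {ideal_ratio:.3f}")
```

If the two figures really are raw and stored data for the same pool, the observed ratio can be compared directly against the ideal one.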

I was wondering if I could get better storage efficiency by getting rid of all the small files and moving them to other storage systems. My understanding is that every file is split into stripe_unit chunks and then mapped into Ceph objects of 4MB each. So if I have a 1MB file, it will be split into 4 x 256KB chunks, another 2 x 256KB chunks will be added as overhead, and every chunk will be mapped into a 4MB Ceph object. This means a 1MB file will be stored as 6 Ceph objects, i.e. the storage usage will be 24MB. I'm not sure if my understanding is correct, though...
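To make that model concrete, here is a minimal Python sketch of the calculation. It only encodes the assumption described above (each chunk occupying a full 4MB object), which is exactly the part I am unsure about. The second function is a hypothetical lower bound where chunks take only their own size. Neither is meant to describe what Ceph actually does.

```python
KIB = 1024
MIB = 1024 * KIB


def padded_usage(file_size: int, k: int = 4, m: int = 2,
                 object_size: int = 4 * MIB) -> tuple[int, int, int]:
    """Assumed model: every chunk occupies one full object of object_size.

    Returns (chunk_size, number_of_chunks, raw_bytes_used).
    """
    chunk_size = -(-file_size // k)   # ceiling division: data per chunk
    n_chunks = k + m                  # k data chunks + m coding chunks
    return chunk_size, n_chunks, n_chunks * object_size


def packed_usage(file_size: int, k: int = 4, m: int = 2) -> int:
    """Hypothetical lower bound: chunks consume only their own size."""
    chunk_size = -(-file_size // k)
    return (k + m) * chunk_size


chunk, count, raw = padded_usage(1 * MIB)
print(f"{count} chunks of {chunk // KIB}KB -> {raw // MIB}MB if padded to 4MB")
print(f"lower bound without padding: {packed_usage(1 * MIB) / MIB}MB")
```

For a 1MB file the two models give 24MB and 1.5MB respectively, so which one is closer to reality makes a big difference for small files.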

Do you have any suggestions on this topic? Is it really worth moving the small files off Ceph? If so, what is the minimum file size I can safely store in Ceph without losing too much storage?

Thanks.

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
