On Jan 9, 2020, at 9:27 AM, Stefan Kooman <ste...@bit.nl> wrote:

Quoting Kyriazis, George (george.kyria...@intel.com):


On Jan 9, 2020, at 8:00 AM, Stefan Kooman <ste...@bit.nl> wrote:

Quoting Kyriazis, George (george.kyria...@intel.com):

The source pool has mainly big files, but there are quite a few
smaller (<4KB) files that I’m afraid will create waste if I create the
destination zpool with ashift > 12 (>4K blocks).  I am not sure,
though, if ZFS will actually write big files in consecutive blocks
(through a send/receive), so maybe the blocking factor is not the
actual file size, but rather the zfs block size.  I am planning on
using zfs gzip-9 compression on the destination pool, if it matters.
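
To put a rough number on the padding concern above, here is a minimal sketch in Python. It assumes that each file's data is simply rounded up to a whole number of 2**ashift-byte sectors, and it deliberately ignores compression, metadata and RAID-Z parity, so treat it as an upper-bound estimate rather than a description of what ZFS actually allocates. The function names are made up for illustration.

```python
def allocated_bytes(file_size: int, ashift: int) -> int:
    """Round file_size up to a whole number of 2**ashift-byte sectors.

    Simplifying assumption: no compression, no metadata, no RAID-Z padding.
    """
    if file_size <= 0:
        return 0
    sector = 1 << ashift
    # Ceiling division without floating point.
    return -(-file_size // sector) * sector


def padding_waste(file_size: int, ashift: int) -> int:
    """Bytes lost to rounding under the same simplifying assumption."""
    return allocated_bytes(file_size, ashift) - file_size


if __name__ == "__main__":
    for size in (1000, 4096, 1024 * 1024):
        for ashift in (12, 13):
            print(size, ashift, allocated_bytes(size, ashift), padding_waste(size, ashift))
```

Under this model a 1000-byte file occupies 4096 bytes at ashift=12 and 8192 bytes at ashift=13, while a 1 MiB file loses nothing at either setting, so the waste depends on how many small files there are rather than on the total data size.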

You might want to consider Zstandard for compression:
https://engineering.fb.com/core-data/smaller-and-faster-data-compression-with-zstandard/

Thanks for the pointer.  Sorry, I am not sure how you are suggesting
I use zstd, since it’s not part of the standard zfs compression
algorithms.

It's in FreeBSD ... and should be in ZOL soon:
https://github.com/zfsonlinux/zfs/pull/9735

FreeNAS is based on FreeBSD, so it will make it there... eventually.  But 
compression is not my problem; I have enough horsepower to deal with gzip-9.  
It’s not the bottleneck.  Ceph file I/O is.

You can optimize a ZFS fs to use larger blocks for those files that are
small ... and use large block sizes for other fs ... if it's easy to
split them.

From what I understand, zfs uses a single block per file if files are
<4K, i.e. it does not put 2 small files in a single block.  How would
larger blocks help small files?  Also, as far as I know ashift is a
pool property, set only at pool creation.

Hmm, I meant you can use large block size for the large files and small
block size for the small files.

Sure, but how do I do that?  As far as I know, block size is a property of the 
pool, not of a single file.
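
One possible reading of the suggestion above: unlike ashift, the ZFS recordsize property is set per dataset, so large and small files could live in separate datasets with different values. The sketch below only builds the `zfs set recordsize=...` command lines and does not run them. The dataset names `tank/large` and `tank/small` and the chosen sizes are hypothetical, and whether this fits a send/receive migration is not settled in this thread.

```python
def recordsize_commands(datasets: dict[str, str]) -> list[list[str]]:
    """Build one `zfs set recordsize=<size> <dataset>` argv list per dataset.

    The commands are returned, not executed; pass them to subprocess.run
    yourself once you have checked them.
    """
    return [
        ["zfs", "set", f"recordsize={size}", name]
        for name, size in datasets.items()
    ]


if __name__ == "__main__":
    # Hypothetical layout: one dataset for big files, one for small ones.
    plan = {"tank/large": "1M", "tank/small": "16K"}
    for argv in recordsize_commands(plan):
        print(" ".join(argv))
```

Note that recordsize is an upper bound on the block size for files written after the property is set; it does not rewrite existing data.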

Thanks!

George


I don’t have control over the original files and how they are stored
on the source server.  These are users’ files.

Then you somehow need to find a middle ground.

Gr. Stefan

--
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
