roland:
I created two ZFS filesystems backed by image files used as devices,
i.e. I created them on top of two empty files of exactly the same size.

Then I enabled compression on one of them (zfs set compression=on compressedzfs).

After copying a large file to both filesystems, I unmounted them,
exported the pools, and ran gzip on the ZFS image files.
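
Roughly what I did (the mkfile size, the pool name plainzfs, and the
file paths here are just placeholders for illustration):

    # two identical empty backing files
    mkfile 256m /var/tmp/plain.img /var/tmp/comp.img

    # one pool per file; enable compression on only one of them
    zpool create plainzfs /var/tmp/plain.img
    zpool create compressedzfs /var/tmp/comp.img
    zfs set compression=on compressedzfs

    # copy the same large file to both, then export the pools
    cp /path/to/largefile /plainzfs/
    cp /path/to/largefile /compressedzfs/
    zpool export plainzfs
    zpool export compressedzfs

    # gzip the backing files and compare the results
    gzip /var/tmp/plain.img /var/tmp/comp.img
    ls -l /var/tmp/plain.img.gz /var/tmp/comp.img.gz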

After gzipping, the image file with compression=on is nearly twice as
big as the image file with compression=off.

This is something I wouldn't have expected.

Tomas Ögren wrote:
The compression used in ZFS isn't as good as gzip's, because that would
take too much CPU (and I've heard they just snatched the code from the
kernel panic crash dump path, which isn't allowed to allocate memory,
for instance). It's a simple form of Lempel-Ziv, a "quick and kinda
good" compression algorithm rather than a thorough one. Before
compression, data can be quite compressible (either fast with a larger
end result, or slow with a smaller one), but after compression (even
with a non-ideal algorithm) the data is close to random, which is quite
hard to re-compress.

Try the difference between zfs -> zfs+gzip versus gzip -> gzip+gzip.
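
For example (bigfile is just a placeholder), gzipping an
already-gzipped file gains almost nothing:

    # first pass compresses well; the second pass barely shrinks
    # the result, and can even make it slightly larger
    gzip -c bigfile    > bigfile.gz
    gzip -c bigfile.gz > bigfile.gz.gz
    ls -l bigfile bigfile.gz bigfile.gz.gz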

Another factor that contributes to the difference is that ZFS
compression is per-block, while gzip remembers patterns throughout the
whole file, resulting in much smaller output for later repetitions.
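
You can get a rough feel for the per-block effect with ordinary tools
(bigfile and the 128K chunk size are only illustrative; the real ZFS
record size depends on the dataset):

    # whole file as one gzip stream
    gzip -c bigfile | wc -c

    # same data split into 128K chunks, each compressed independently,
    # which is closer to what per-block compression sees
    split -b 131072 bigfile chunk.
    gzip chunk.*
    cat chunk.*.gz | wc -c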

This principle has been used for comparing bodies of text for style
analysis and text recognition. Compressing a sample together with a
larger corpus representative of a particular source (author, style)
results in smaller files for samples that better match the source.
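
A crude sketch of that idea with gzip (corpus.txt, sampleA.txt and
sampleB.txt are hypothetical files): compress the corpus alone, then
with each candidate sample appended, and see which sample adds the
fewest extra bytes:

    # size of the corpus compressed on its own
    gzip -c corpus.txt | wc -c

    # size with each sample appended; the better-matching sample adds
    # fewer bytes (gzip's small window limits this in practice, but it
    # illustrates the idea)
    cat corpus.txt sampleA.txt | gzip -c | wc -c
    cat corpus.txt sampleB.txt | gzip -c | wc -c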


--
Henk Langeveld <[EMAIL PROTECTED]>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss