> Assuming that you may pick a specific compression algorithm,
> most algorithms can have different levels/percentages of
> deflations/inflations which affects the time to compress
> and/or inflate wrt the CPU capacity.
Yes? I'm not sure what your point is. Are you suggesting that, rather
than hard-coding (for instance) the nine "gzip1/gzip2/.../gzip9"
alternatives, it would be useful to have a single "gzip" setting with a
compression-level parameter? That might make some sense, but in
practice there's a limited number of compression algorithms and limited
utility in tuning the degree of compression, so the current approach
doesn't sacrifice much. (More complex compression algorithms have more
knobs to tweak, too, and it doesn't seem particularly useful to expose
all of them.)

> Secondly, if I can add an additional item, would anyone
> want to be able to encrypt the data vs compress

Yes, and I think Darren Moffat is working on it. Encryption and
compression are orthogonal, though. (The only constraint is that it's
far preferable to compress first and then encrypt, since compression
relies on regularity in the data stream, which encryption removes.)

> Third, if data were to be compressed within a file
> object, should a reader be made aware that the data
> being read is compressed or should he just read
> garbage?

I don't understand your question here. Compression is transparent, so a
reader will get back exactly what was written. Both the compression and
the decompression happen automatically. (There's a separate issue that
backup applications would like to be able to read the compressed data
directly; I haven't checked whether there's an ioctl to enable this
yet.)

> Fourth, if you take 8k and expect to alloc 8k of disk
> block storage for it and compress it to 7k, are you
> really saving 1k? Or are you just creating an additional
> 1K of internal fragmentation?

You're really saving 1K, because the disk space is not allocated until
after the compression step. Remember, ZFS uses variably-sized blocks.
In your example, you'll allocate a 7K block which happens to hold 8K
worth of the user's data.
> Fifth and hopefully last, should the znode have a
> new length field that keeps the non-compressed length
> for Posix compatibility.

With this and your third question, I think you've got a fundamental
misunderstanding of what the compression in ZFS does. It is transparent
to the application. The application reads and writes uncompressed data,
it sees uncompressed file sizes, and it has no way to know that the
file has been compressed (other than looking at stat data and counting
the blocks used). The znode already records the uncompressed length, so
no new field is needed.

> Really last..., why not just compress the data as a stream
> before writing it out to disk? Then you can at least do
> a file on it and identify the type of compression...

This is preferable when the application supports it, because it allows
you to compress the whole file at once and get better compression
ratios, choose an appropriate compression algorithm, avoid trying to
compress incompressible data, etc. However, it's less general, since it
requires that the application do the compression. If you have existing
applications which only deal with uncompressed data, then having the
file system do the compression is useful.

This isn't exactly new. Stac's Stacker did this for DOS (at the disk
level, not the file system level) in the early 1990s. File system level
compression came in around the same time (DiskDoubler and StuffIt
SpaceSaver on the Mac, for instance). Windows NTFS has built-in
compression, but it compresses the whole file rather than individual
blocks. (Better compression, but the performance isn't as good if
you're only reading a small portion of the file.)

Anton

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss