On Tue, Dec 15, 2009 at 3:06 PM, Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:
> Robert Milkowski <mi...@task.gda.pl> writes:
>> On 13/12/2009 20:51, Steve Radich, BitShop, Inc. wrote:
>>> Because if you can de-dup anyway why bother to compress THEN check?
>>> This SEEMS to be the behaviour - i.e. I would suspect many of the
>>> files I'm writing are dups - however I see high cpu use even though
>>> on some of the copies I see almost no disk writes.
>>
>> First, the checksum is calculated after compression happens.
>
> for some reason I, like Steve, thought the checksum was calculated on
> the uncompressed data, but a look in the source confirms you're right,
> of course.
>
> thinking about the consequences of changing it, RAID-Z recovery would be
> much more CPU intensive if hashing was done on uncompressed data --

I don't quite see how dedupe (based on sha256) and parity (based on crc32) are related.

Regards,
Andrey

> every possible combination of the N-1 disks would have to be
> decompressed (and most combinations would fail), and *then* the
> remaining candidates would be hashed to see if the data is correct.
>
> this would be done on a per recordsize basis, not per stripe, which
> means reconstruction would fail if two disk blocks (512 octets) on
> different disks and in different stripes go bad. (doing an exhaustive
> search for all possible permutations to handle that case doesn't seem
> realistic.)
>
> in addition, hashing becomes slightly more expensive since more data
> needs to be hashed.
>
> overall, my guess is that this choice (made before dedup!) will give
> worse performance in normal situations in the future, when dedup+lzjb
> will be very common, at a cost of faster and more reliable resilver. in
> any case, there is not much to be done about it now.
>
> --
> Kjetil T. Homme
> Redpill Linpro AS - Changing the game
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
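
For anyone following the quoted recovery argument, here is a rough Python sketch of the cost difference Kjetil describes. It is not how ZFS implements RAID-Z: zlib stands in for lzjb, the parity is a plain single XOR column, and every name in it (split_columns, candidates, recover) is made up for illustration. It only shows that when the checksum covers the stored (compressed) bytes, each reconstruction candidate can be hashed directly, whereas a checksum over the uncompressed bytes forces a decompression attempt per candidate.

```python
import hashlib
import zlib
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_columns(payload: bytes, n: int) -> list[bytes]:
    """Split payload into n equal-size columns, zero-padding the tail."""
    size = -(-len(payload) // n)  # ceiling division
    padded = payload.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def candidates(columns: list[bytes], parity: bytes):
    """Yield the payload rebuilt under the guess that column i is bad."""
    for i in range(len(columns)):
        others = columns[:i] + columns[i + 1:]
        rebuilt = reduce(xor_bytes, others, parity)
        yield b"".join(columns[:i] + [rebuilt] + columns[i + 1:])


def recover(columns, parity, length, checksum, hash_uncompressed):
    """Return (logical data, number of decompression attempts)."""
    attempts = 0
    for cand in candidates(columns, parity):
        stored = cand[:length]  # drop the zero padding
        if not hash_uncompressed:
            # Checksum covers stored bytes: verify first, decompress once.
            if hashlib.sha256(stored).digest() == checksum:
                return zlib.decompress(stored), 1
            continue
        # Checksum covers logical bytes: must decompress every candidate.
        attempts += 1
        try:
            logical = zlib.decompress(stored)
        except zlib.error:
            continue  # most wrong candidates fail right here
        if hashlib.sha256(logical).digest() == checksum:
            return logical, attempts
    raise ValueError("no candidate matched the checksum")


data = bytes(range(256)) * 64
stored = zlib.compress(data)
columns = split_columns(stored, 4)
parity = reduce(xor_bytes, columns)

# Silently corrupt one column; the reader does not know which one.
columns[2] = bytes(b ^ 0xFF for b in columns[2])

after_data, after_attempts = recover(
    columns, parity, len(stored), hashlib.sha256(stored).digest(), False)
before_data, before_attempts = recover(
    columns, parity, len(stored), hashlib.sha256(data).digest(), True)
print("checksum after compression: ", after_attempts, "decompression(s)")
print("checksum before compression:", before_attempts, "decompression(s)")
```

Both variants recover the same data in this toy setup; the difference is only in how many decompression attempts are spent on wrong candidates, and that gap grows with the number of columns tried.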