> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
> 
> When you read back duplicate data that was previously written with
> dedup, then you get a lot more cache hits, and as a result, the reads go
> faster.  Unfortunately these gains are diminished...  I don't know by
> what...  But you only have about 2x to 4x performance gain reading
> previously dedup'd data, as compared to reading the same data which was
> never dedup'd.  Even when repeatedly reading the same file which is 100%
> duplicate data (created by dd from /dev/zero) so all the data is 100% in
> cache...   I still see only 2x to 4x performance gain with dedup.

For what it's worth:

I also repeated this without dedup.  Created a large file (17G, about as big
as will still fit entirely in my ARC).  Rebooted.  Timed reading it.
Now it's entirely in cache.  Timed reading it again.
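
The procedure above can be sketched roughly as follows.  This is only an
illustration in Python, not the commands actually used for the test:
time_read and gbit_per_sec are hypothetical helper names, the 1 MiB block
size is an arbitrary choice, and the script does not reboot or empty the ARC,
so the first pass is only a cold read if the file was not already cached.

```python
import sys
import time


def time_read(path, block_size=1 << 20):
    """Read `path` sequentially; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffer, so each read() maps to one
    # OS-level read and only the filesystem cache (the ARC on ZFS) is involved.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def gbit_per_sec(nbytes, seconds):
    """Convert a byte count and a duration to Gbit/sec (1 Gbit = 10**9 bits)."""
    return nbytes * 8 / seconds / 1e9


if __name__ == "__main__" and len(sys.argv) == 2:
    # Two passes over the same file: the second should be served from cache
    # if the file fits in memory.
    for label in ("first pass", "second pass"):
        nbytes, secs = time_read(sys.argv[1])
        print(f"{label}: {nbytes} bytes in {secs:.2f}s, "
              f"{gbit_per_sec(nbytes, secs):.2f} Gbit/sec")
```

Run it with the path of the test file as the only argument, once right after
a reboot, and compare the two reported rates.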

When it's not cached, of course the read time was equal to the original
write time.  When it's cached, it goes 4x faster.  Perhaps this is only
because I'm testing on a machine that has super fast storage...  11 striped
SAS disks yielding 8Gbit/sec, as compared to all-RAM, which yielded
31.2Gbit/sec.  So in this case, RAM is only about 4x faster than the storage
itself...  I would have expected a couple of orders of magnitude...  So
perhaps my expectations are off, or the ARC itself simply incurs overhead.
Either way, dedup is not to blame for obtaining merely a 2x to 4x performance
gain over the non-dedup equivalent.
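
For reference, the arithmetic behind the "4x" figure, as a small Python
sketch.  The two throughput numbers are the ones measured above; treating
"17G" as 17 GiB and 1 Gbit as 10**9 bits is my assumption, and the function
names are only illustrative.  Under those assumptions the ratio is 3.9x, and
the 17G file would take roughly 18 seconds from disk versus under 5 seconds
from cache.

```python
def speedup(cached_gbit, disk_gbit):
    """Ratio of cached throughput to on-disk throughput."""
    return cached_gbit / disk_gbit


def read_seconds(size_bytes, gbit_per_sec):
    """Seconds needed to read size_bytes at the given rate (1 Gbit = 10**9 bits)."""
    return size_bytes * 8 / (gbit_per_sec * 1e9)


FILE_BYTES = 17 * 2**30  # assumption: "17G" means 17 GiB
DISK_GBIT = 8.0          # measured: 11 striped SAS disks
RAM_GBIT = 31.2          # measured: file served entirely from cache

if __name__ == "__main__":
    print(f"speedup:     {speedup(RAM_GBIT, DISK_GBIT):.1f}x")
    print(f"disk read:   {read_seconds(FILE_BYTES, DISK_GBIT):.1f} s")
    print(f"cached read: {read_seconds(FILE_BYTES, RAM_GBIT):.1f} s")
```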

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
