Two caveats inline …

On 1 Feb 2011, at 01:05, Garrett D'Amore wrote:
> On 01/31/11 04:48 PM, Roy Sigurd Karlsbakk wrote:
>>> As I've said here on the list a few times earlier, most recently in
>>> the thread 'ZFS not usable (was ZFS Dedup question)', I've been doing
>>> some rather thorough testing of zfs dedup, and as you can see from
>>> those posts, the results weren't very satisfactory. The docs claim
>>> 1-2GB of memory usage per terabyte stored, ARC or L2ARC, but as you
>>> can read from the posts, I don't find this very likely.
>>>
>> Sorry about the initial post - it was wrong. The hardware configuration
>> was right, but for the initial tests I used NFS, meaning sync writes.
>> This obviously stresses the ARC/L2ARC more than async writes would, but
>> the result remains the same.
>>
>> With 140GB worth of L2ARC on two X25-Ms, plus 4GB partitions on the
>> same devices in a mirror, the write speed was reduced to something like
>> 20% of the original speed. This was with about 2TB used on the zpool
>> and a single data stream, no parallelism whatsoever. Even with 8GB of
>> ARC and 140GB of L2ARC on two SSDs, this speed is fairly low. I could
>> not see substantially high CPU or I/O load during this test.
>>
> I would not expect good write performance with dedup... dedup isn't
> going to make writes fast - it's something you want on a system with a
> lot of duplicated data that sustains a lot of reads. (That said, highly
> duplicated data with a DDT that fits entirely in RAM might see a benefit
> from not having to write metadata frequently. But I suspect an SLOG here
> is going to be critical to getting good performance, since you'll still
> have a lot of synchronous metadata writes.)
>
> - Garrett

There is one circumstance where dedup could improve the write path: on a system whose data is highly dedupable *and* which is under heavy write load, it may be useful to forego the large data write and instead convert it into a smaller (and more frequent) metadata write. SLOGs would then show more benefit, and we'd relieve pressure on the back-end for throughput.

On a system with a high read ratio, deduped data currently would be quite efficient, but there is one pathology in current ZFS which impacts this somewhat: last time I looked, each ARC reference to a deduped block leads to an inflated ARC copy of the data, so a highly referenced block (20x, for instance) could exist 20 times in an inflated state in the ARC after reads reference each occurrence. Dedup of inflated data in the ARC was a pending ZFS optimisation …

Craig

>> Best regards
>>
>> roy
>> --
>> Roy Sigurd Karlsbakk
>> (+47) 97542685
>> r...@karlsbakk.net
>> http://blogg.karlsbakk.net/
>> --
>> In all pedagogy it is essential that the curriculum be presented
>> intelligibly. It is an elementary imperative for all pedagogues to
>> avoid excessive use of idioms of foreign origin. In most cases adequate
>> and relevant synonyms exist in Norwegian.
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>>

--
Craig Morgan
Cinnabar Solutions Ltd
t: +44 (0)791 338 3190
f: +44 (0)870 705 1726
e: cr...@cinnabar-solutions.com
w: www.cinnabar-solutions.com