On 5/4/2011 4:44 PM, Tim Cook wrote:
On Wed, May 4, 2011 at 6:36 PM, Erik Trimble <erik.trim...@oracle.com> wrote:
On 5/4/2011 4:14 PM, Ray Van Dolson wrote:
On Wed, May 04, 2011 at 02:55:55PM -0700, Brandon High wrote:
On Wed, May 4, 2011 at 12:29 PM, Erik Trimble <erik.trim...@oracle.com> wrote:
I suspect that NetApp does the following to limit their resource usage: they presume the presence of some sort of cache that can be dedicated to the DDT (and, since they also control the hardware, they can make sure there is always one present). Thus, they can make their code
AFAIK, NetApp has more restrictive requirements about how much data can be dedup'd on each type of hardware.
See page 29 of http://media.netapp.com/documents/tr-3505.pdf - smaller pieces of hardware can only dedup 1TB volumes, and even the big-daddy filers will only dedup up to 16TB per volume, even if the volume size is 32TB (the largest volume available for dedup).
NetApp solves the problem by putting rigid constraints around the problem, whereas ZFS lets you enable dedup for any size dataset. Both approaches have limitations, and it sucks when you hit them.

-B
That is very true, although it's worth mentioning that you can have quite a few of the dedupe/SIS-enabled FlexVols on even the lower-end filers (our FAS2050 has a bunch of 2TB SIS-enabled FlexVols).
Stupid question - can you hit all the various SIS volumes at once, and not get horrid performance penalties?

If so, I'm almost certain NetApp is doing post-write dedup. That way, the strictly controlled max FlexVol size helps with keeping the resource limits down, as it will be able to round-robin the post-write dedup to each FlexVol in turn.
ZFS's problem is that it needs ALL the resources for EACH pool ALL the time, and can't really share them well if it expects to keep performance from tanking... (no pun intended)
On a 2050? Probably not. It's got a single-core mobile Celeron CPU and 2GB of RAM. You couldn't even run ZFS on that box, much less ZFS+dedup. Can you do it on a model that isn't 4 years old without tanking performance? Absolutely.

Outside of those two 2000 series, the reason there are dedup limits isn't performance.

--Tim
Indirectly, yes, it's performance, since NetApp has plainly chosen post-write dedup as a method to restrict the required hardware capabilities. The dedup limits on volume size are almost certainly driven by the local RAM requirements for post-write dedup.
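Some rough arithmetic makes the point (the figures below are my assumptions for illustration, not NetApp's published numbers): with 4KB WAFL blocks and something on the order of 32 bytes of fingerprint metadata per block, every TB of dedup-enabled data means gigabytes of fingerprint records that have to be held or sorted during each post-process pass. A quick sketch:

    # Back-of-envelope fingerprint-database size for post-process dedup.
    # Assumed figures: 4 KiB blocks, 32 bytes of fingerprint metadata per
    # block. Illustrative only, not NetApp's actual on-disk format.

    def fingerprint_bytes(vol_bytes, block=4 * 1024, per_block=32):
        # One fingerprint record per filesystem block in the volume.
        return (vol_bytes // block) * per_block

    TIB = 1024 ** 4
    for tb in (1, 16, 32):
        gib = fingerprint_bytes(tb * TIB) / 1024 ** 3
        print(f"{tb:>2} TiB volume -> ~{gib:.0f} GiB of fingerprint records")

Capping the volume at 16TB caps that table at something a filer can reasonably churn through in a scheduled pass.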
It also looks like NetApp isn't providing for a dedicated DDT cache, which means that when the NetApp is doing dedup, it's consuming the normal filesystem cache (i.e. chewing through RAM). Frankly, I'd be very surprised if you didn't see a noticeable performance hit during the period that the NetApp appliance is performing the dedup scans.
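For comparison, the same kind of sketch for ZFS's always-on DDT, using the rule-of-thumb figures that get quoted on this list (roughly 320 bytes of core per unique block, 128KB average recordsize; these are assumptions, not measurements, and `zdb -DD <pool>` will report the real numbers for a given pool):

    # Rough in-core DDT footprint for ZFS dedup, per pool.
    # Rule-of-thumb figures only: ~320 bytes per unique block, 128 KiB
    # average block size, worst case where every block is unique.

    def ddt_ram_bytes(pool_bytes, avg_block=128 * 1024, per_entry=320):
        # One DDT entry per unique block in the pool.
        return (pool_bytes // avg_block) * per_entry

    TIB = 1024 ** 4
    for tb in (1, 16, 32):
        gib = ddt_ram_bytes(tb * TIB) / 1024 ** 3
        print(f"{tb:>2} TiB of unique data -> ~{gib:.1f} GiB of DDT")

And unlike a scheduled post-process pass, ZFS wants that table resident (ARC or L2ARC) whenever the pool is taking writes, which is exactly the "ALL the resources, ALL the time" problem above.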
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss