On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
Never Best wrote:
Sorry, I couldn't find this anywhere yet. For deduplication it is best to
have the lookup table in RAM, but I wasn't sure how much RAM is
suggested.
::Assuming 128 KB block size and 100% unique data:
1 TB = 1024*1024*1024 KB; 1024*1024*1024 KB / 128 KB = 8388608 blocks
::Assuming each block needs an 8-byte pointer:
8388608 * 8 = 67108864 bytes
::RAM suggested per TB:
67108864 / 1024 / 1024 = 64 MB
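Here is the same back-of-envelope math as a small Python sketch, in case
someone wants to plug in different numbers (the 8 bytes per entry is just
my assumption, not an actual ZFS figure):

    # Back-of-envelope: RAM per TB of unique data for the dedup lookup
    # table, assuming 128 KB blocks and 8 bytes per table entry
    # (the 8-byte entry size is an assumption, not a ZFS number).
    TB = 1024 ** 4                        # 1 TiB in bytes
    block_size = 128 * 1024               # 128 KB default recordsize
    bytes_per_entry = 8                   # assumed pointer size
    blocks = TB // block_size             # 8388608 blocks
    ram_bytes = blocks * bytes_per_entry  # 67108864 bytes
    print(blocks, ram_bytes / (1024 ** 2), "MB")  # -> 8388608 64.0 MB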
So if I understand correctly, we should have a minimum of 64 MB of RAM
per TB for dedup? *hopes my math wasn't way off* Or is there significant
extra overhead stored per block for the lookup table? For example, is
there some kind of redundancy on the lookup table (relevant to the RAM
space requirements) to counter corruption?
I read some articles, and they all mention that there is a significant
performance loss if the table isn't in RAM, but none really mentioned
how much RAM one should have per TB of deduped data.
Thanks; I hope someone can confirm *or give me the real numbers*.
I know the block size is variable; I'm most interested in the
default ZFS setup right now.
There were several detailed discussions about this over the past 6
months that should be in the archives. I believe most of the info
came from Richard Elling.
Look for both my name and Richard's, going back about a year. In
particular, this thread started out with a good flow of data:
http://www.mail-archive.com/[email protected]/msg35349.html
Bottom line: about 270 bytes per record (one dedup-table entry per unique block).
So, for a 4 KB record size, that works out to about 67 GB per 1 TB of unique
data; a 128 KB record size means about 2 GB per 1 TB.
Dedup means buy a (big) SSD for L2ARC.
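If you want to redo that arithmetic for other record sizes, here is a
quick sketch (using the ~270 bytes/entry figure from the thread above;
the exact entry size varies, so treat the output as a rough estimate):

    # Rough dedup-table RAM estimate: ~270 bytes per entry, one entry
    # per unique block, for 1 TB of unique data.
    TB = 1024 ** 4
    bytes_per_entry = 270
    for recordsize in (4 * 1024, 128 * 1024):     # 4 KB and 128 KB
        entries = TB // recordsize
        ram_gb = entries * bytes_per_entry / (1024 ** 3)
        print(recordsize // 1024, "KB recordsize ->",
              round(ram_gb, 1), "GB per TB of unique data")
    # prints roughly: 4 KB -> 67.5 GB, 128 KB -> 2.1 GB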
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss