On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
Never Best wrote:
Sorry, I couldn't find this anywhere yet. For deduplication it is best to have the lookup table in RAM, but I'm not sure how much RAM is recommended.

::Assuming a 128 KB block size and 100% unique data:
1 TB / 128 KB = 1024*1024*1024 KB / 128 KB = 8388608 blocks
::Each block needs an 8-byte pointer?
8388608 * 8 = 67108864 bytes
::RAM suggested per TB:
67108864 / 1024 / 1024 = 64 MB
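
A quick sanity check of that arithmetic in Python, keeping the 8-bytes-per-block figure as the stated assumption (that per-entry size is the part in question):

# Back-of-the-envelope dedup-table RAM estimate, assuming 8 bytes per block
# (an assumption; the real per-entry size is discussed later in the thread).
TIB = 1024**4              # 1 TiB in bytes
block_size = 128 * 1024    # default ZFS recordsize, 128 KiB
bytes_per_entry = 8        # assumed pointer size per block

blocks = TIB // block_size             # 8388608 blocks per TiB of unique data
ram_bytes = blocks * bytes_per_entry   # 67108864 bytes
print(f"{blocks} blocks -> {ram_bytes / 2**20:.0f} MiB of table per TiB")  # ~64 MiB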

So if I understand correctly, we should have a minimum of 64 MB of RAM per TB for deduplication? *hopes my math wasn't way off* Or is there significant extra overhead stored per block in the lookup table? For example, is there some kind of redundancy on the lookup table (which would affect the RAM space requirements) to counter corruption?

I read some articles, and they all mention that there is a significant performance loss if the table isn't in RAM, but none really said how much RAM one should have per TB of deduplicated data.

Thanks; I hope someone can confirm this *or give me the real numbers*. I know the block size is variable; I'm most interested in the default ZFS setup right now.
There were several detailed discussions about this over the past 6 months that should be in the archives. I believe most of the info came from Richard Elling.

Look for both my name and Richard's, going back about a year. In particular, this thread started off with a good flow of data:

http://www.mail-archive.com/[email protected]/msg35349.html


Bottom line: 270 bytes per DDT entry (record).

So, for a 4 KB record size, that works out to roughly 67 GB of dedup table per 1 TB of unique data; a 128 KB record size means about 2 GB per 1 TB.
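
As a rough sketch of where those numbers come from (assuming ~270 bytes per DDT entry, as above):

# Rough DDT size estimate at ~270 bytes per entry (the figure cited above).
TIB = 1024**4
BYTES_PER_DDT_ENTRY = 270

def ddt_bytes_per_tib(record_size):
    """DDT bytes needed per TiB of unique data at the given record size."""
    return (TIB // record_size) * BYTES_PER_DDT_ENTRY

for rs in (4 * 1024, 128 * 1024):
    gib = ddt_bytes_per_tib(rs) / 2**30
    print(f"recordsize {rs // 1024:>3} KiB -> ~{gib:.1f} GiB per TiB of unique data")
# -> ~67.5 GiB at 4 KiB, ~2.1 GiB at 128 KiB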



dedup means buy a (big) SSD for L2ARC.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
