On Nov 4, 2010, at 10:59 AM, Rob Cohen wrote:

> I have read some conflicting things regarding the ZFS record size setting.
> Could you guys verify/correct these statements:
>
> (These reflect my understanding, not necessarily the facts!)
>
> 1) The ZFS record size in a zvol is the unit that dedup happens at. So, for
> a volume that is shared to an NTFS machine, if the NTFS cluster size is
> smaller than the zvol record size, dedup will get dramatically worse, since
> it won't dedup clusters that are positioned differently in zvol records.

Not quite. Dedup happens on a per-block basis. For zvols, the volblocksize
(the zvol counterpart of recordsize) is fixed at zvol creation time, so all
blocks in a zvol have the same size. The physical and logical sizes of the
block are also used to determine whether blocks are identical. Clearly, two
blocks with the same checksum but different sizes are not identical.
Setting the volblocksize larger than the client application's minimum
blocksize can decrease dedup effectiveness.

> 2) For shared folders, the record size is the allocation unit size, so large
> records can waste a substantial amount of space, in cases with lots of very
> small files. This is different than a HW raid stripe size, which only
> affects performance, not space usage.

No. The recordsize in a file system is dynamic. Small files use the
smallest possible recordsize. In practice, there is very little waste.
If you are really concerned about that, enable compression.

This is very different from "HW RAID" stripe size. The two have nothing
in common: apples and oranges.

> 3) Although small record sizes have a large RAM overhead for dedup tables, as
> long as the dedup table working set fits in RAM, and the rest fits in L2ARC,
> performance will be good.

Dedup changes large I/Os into small I/Os. If your pool does not perform
small I/Os well, then dedup can have a noticeable impact on performance.

 -- richard

ZFS Tutorial at USENIX LISA'10 Conference next Monday
www.RichardElling.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
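
An illustration of the answer to (1), as a toy model rather than ZFS code:
the sketch below splits a byte string into fixed-size blocks, hashes each
block, and reports total blocks divided by unique blocks. The 4 KiB client
cluster size, the use of SHA-256 as the block checksum, the synthetic
volume contents, and the names make_volume and dedup_ratio are all
assumptions made for the example. It only shows why a block size larger
than the client's cluster size tends to hide duplicates; it says nothing
about real on-disk behavior.

```python
import hashlib
import random

CLUSTER = 4096  # assumed client (e.g. NTFS) cluster size in bytes


def make_volume(n_clusters=256, n_distinct=8, seed=0):
    """Build a toy volume where each cluster holds one of a few contents.

    Duplicate clusters therefore land at arbitrary positions, as they
    would when a client file system places identical data at different
    offsets.
    """
    rng = random.Random(seed)
    patterns = [bytes([k]) * CLUSTER for k in range(n_distinct)]
    return b"".join(rng.choice(patterns) for _ in range(n_clusters))


def dedup_ratio(volume, blocksize):
    """Return total blocks / unique blocks for fixed-size blocks."""
    blocks = [volume[i:i + blocksize]
              for i in range(0, len(volume), blocksize)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(blocks) / len(unique)


volume = make_volume()
for bs in (4096, 16384, 65536):
    print(f"blocksize={bs:6d}  dedup ratio={dedup_ratio(volume, bs):.1f}")
```

With a block size equal to the cluster size, every repeated cluster is
found. With larger blocks, two blocks match only if all the clusters
inside them match in the same order, so the ratio can only stay the same
or fall.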