Just another data point: the DDT is considered metadata, and by default the ARC will not allow more than 1/4 of its size to be used for metadata. Are you still sure it fits?
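As a back-of-envelope check, a minimal sketch of the arithmetic (the ~320 bytes per in-core DDT entry is an assumption; check `zdb -DD <pool>` on the actual pool for real entry sizes):

```shell
# Rough estimate of in-core DDT size for N unique blocks,
# assuming ~320 bytes per DDT entry (an assumption, not a measured value).
blocks=400000
entry_bytes=320
ddt_bytes=$((blocks * entry_bytes))
echo "estimated DDT size: $((ddt_bytes / 1024 / 1024)) MB"
# Compare that figure against the ARC metadata cap and current usage, e.g.:
#   kstat -p zfs:0:arcstats:arc_meta_limit
#   kstat -p zfs:0:arcstats:arc_meta_used
```

If the estimate approaches `arc_meta_limit` (by default 1/4 of the ARC), the DDT will start getting evicted even though the ARC as a whole looks far from full.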
Erik Trimble <erik.trim...@oracle.com> wrote:

> On 5/7/2011 6:47 AM, Edward Ned Harvey wrote:
>>> See below. Right around 400,000 blocks, dedup is suddenly an order of
>>> magnitude slower than without dedup.
>>>
>>>   400000   10.7sec   136.7sec   143 MB   195 MB
>>>   800000   21.0sec   465.6sec   287 MB   391 MB
>>
>> The interesting thing is - in all these cases, the complete DDT and the
>> complete data file itself should fit comfortably in the ARC. So it makes
>> no sense for performance to be so terrible at this level.
>>
>> So I need to start figuring out exactly what's going on. Unfortunately I
>> don't know how to do that very well. I'm looking for advice from anyone -
>> how to poke around and see how much memory is being consumed for what
>> purposes. I know how to look up c_min, c, and c_max... but that didn't do
>> me much good. The actual value of c barely changes at all over time -
>> even when I rm the file, c does not change immediately.
>>
>> All the other metrics from kstat have less than obvious names, so I
>> don't know what to look for...
>>
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
> Some minor issues that might affect the above:
>
> (1) I'm assuming you run your script repeatedly in the same pool, without
> deleting the pool. If that is the case, a run of X+1 should dedup
> completely against the run of X. E.g. a run with 120000 blocks will dedup
> its first 110000 blocks against the prior run of 110000.
>
> (2) Can you NOT enable "verify"? Verify *requires* a disk read before
> writing any potentially dedup-able block. If case #1 above applies, then
> with verify enabled you *rapidly* increase the amount of disk I/O required
> on each subsequent run. E.g. the run of 100000 requires no disk I/O due to
> verify, but the run of 110000 requires 100000 read requests, while the run
> of 120000 requires 110000 requests, etc. This will skew your results as
> the ARC's buffering of file info changes over time.
>
> (3) fflush is NOT the same as fsync. If you're running the script in a
> loop, it's entirely possible that ZFS hasn't completely committed things
> to disk yet, which means that you get I/O requests to flush out the ARC
> write buffer in the middle of your runs. Honestly, I'd do the following
> for benchmarking:
>
>     i=0
>     while [ $i -lt 80 ]; do
>         j=$((100000 + i * 10000))
>         ./run_your_script $j
>         sync
>         sleep 10
>         i=$((i + 1))
>     done
>
> --
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
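To make the point (2) effect concrete, a small sketch tallying the verify-induced reads across successive runs, under Erik's assumption that each run re-reads every block it shares with the prior run (the run sizes here are the ones from the thread):

```shell
# Each run of N blocks dedups against the previous run, and with verify on,
# every dedup hit costs a disk read - so reads-per-run equals the prior run's size.
prev=0
total=0
for run in 100000 110000 120000 130000; do
    reads=$prev                      # blocks shared with (and re-read from) the prior run
    total=$((total + reads))
    echo "run of $run blocks: $reads verify reads"
    prev=$run
done
echo "cumulative verify reads: $total"
```

The read load grows with every iteration even though each run only adds 10000 new blocks, which is exactly the skew Erik describes.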