Just another data point.  The DDT is considered metadata, and by default the 
ARC will not allow more than 1/4 of its size to be used for metadata 
(arc_meta_limit).   Are you still sure it fits?
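
For what it's worth, something along these lines (pool name "tank" is just a
placeholder, and counter names can vary between builds) should show how much
of the ARC is going to metadata versus that cap, and roughly how big the DDT is:

    # ARC metadata usage vs. the metadata limit (defaults to 1/4 of the max ARC size)
    kstat -p zfs:0:arcstats:arc_meta_used
    kstat -p zfs:0:arcstats:arc_meta_limit

    # DDT entry counts and approximate in-core / on-disk sizes
    zdb -DD tank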

Erik Trimble <erik.trim...@oracle.com> wrote:

>On 5/7/2011 6:47 AM, Edward Ned Harvey wrote:
>>> See below.  Right around 400,000 blocks, dedup is suddenly an order of
>>> magnitude slower than without dedup.
>>>
>>> 400000              10.7sec         136.7sec        143 MB          195 MB
>>> 800000              21.0sec         465.6sec        287 MB          391 MB
>>
>> The interesting thing is - In all these cases, the complete DDT and the
>> complete data file itself should fit entirely in ARC comfortably.  So it
>> makes no sense for performance to be so terrible at this level.
>>
>> So I need to start figuring out exactly what's going on.  Unfortunately I
>> don't know how to do that very well.  I'm looking for advice from anyone -
>> how to poke around and see how much memory is being consumed for what
>> purposes.  I know how to look up c_min, c, and c_max...  But that didn't do
>> me much good.  The actual value for c barely changes at all over time...
>> Even when I rm the file, c does not change immediately.
>>
>> All the other metrics from kstat ... have less than obvious names ... so I
>> don't know what to look for...
>>
>
>Some minor issues that might affect the above:
>
>(1) I'm assuming you run your script repeatedly in the same pool, 
>without deleting the pool. If that is the case, run X+1 should dedup 
>completely against run X.  E.g. a run with 120000 blocks will dedup its 
>first 110000 blocks against the prior run of 110000.
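
(A quick way to sanity-check that assumption, assuming a pool named "tank":)

    # overall dedup ratio for the pool; it should climb with each repeated run
    zpool get dedupratio tank

    # per-DDT breakdown of unique vs. duplicate entries
    zdb -D tank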
>
>(2) Can you try NOT enabling "verify"?  Verify *requires* a disk read 
>before writing any potentially dedup-able block. If case #1 above 
>applies, then with verify on you *rapidly* increase the amount of disk 
>I/O required on each subsequent run.  E.g. the run of 100000 requires no 
>verify-driven disk I/O, but the run of 110000 requires 100000 read 
>requests, the run of 120000 requires 110000 requests, etc.  This will 
>skew your results as the ARC buffering of file info changes over time.
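
(To rule that out, it should be enough to drop back to checksum-only dedup
and re-run; the dataset name below is just a placeholder:)

    # checksum-only dedup: no read-back before writing a duplicate block
    zfs set dedup=on tank/test

    # confirm what the dataset is currently using (on, verify, sha256, ...)
    zfs get dedup tank/test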
>
>(3) fflush is NOT the same as fsync.  If you're running the script in a 
>loop, it's entirely possible that ZFS hasn't completely committed things 
>to disk yet, which means that you get I/O requests to flush out the ARC 
>write buffer in the middle of your runs.   Honestly, I'd do the 
>following for benchmarking:
>
>         # block count for each run: 100000, 110000, ..., 890000
>         i=0
>         while [ $i -lt 80 ]
>         do
>             j=$((100000 + i * 10000))
>             ./run_your_script $j
>             sync
>             sleep 10
>             i=$((i + 1))
>         done
>
>
>
>-- 
>Erik Trimble
>Java System Support
>Mailstop:  usca22-123
>Phone:  x17195
>Santa Clara, CA
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
