Re: [zfs-discuss] Summary: Dedup and L2ARC memory requirements

2011-05-07 Thread Erik Trimble

On 5/7/2011 6:47 AM, Edward Ned Harvey wrote:

See below.  Right around 400,000 blocks, dedup is suddenly an order of
magnitude slower than without dedup.

40      10.7sec     136.7sec    143 MB      195 MB

80      21.0sec     465.6sec    287 MB      391 MB

The interesting thing is - In all these cases, the complete DDT and the
complete data file itself should fit entirely in ARC comfortably.  So it
makes no sense for performance to be so terrible at this level.

So I need to start figuring out exactly what's going on.  Unfortunately I
don't know how to do that very well.  I'm looking for advice from anyone -
how to poke around and see how much memory is being consumed for what
purposes.  I know how to look up c_min, c, and c_max...  But that didn't do
me much good.  The actual value of c barely changes at all over time...
Even when I rm the file, c does not change immediately.

All the other metrics from kstat ... have less than obvious names ... so I
don't know what to look for...
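For poking around at that level, the ARC counters live in the arcstats
kstat, and zdb can report the DDT size directly.  Something along these
lines ("tank" below is just a placeholder pool name):

# current ARC size plus its c / c_max targets
kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max
# metadata portion of the ARC (this is where the DDT lives)
kstat -p zfs:0:arcstats:arc_meta_used zfs:0:arcstats:arc_meta_limit
# DDT entry counts and in-core / on-disk entry sizes, with histogram
zdb -DD tank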



Some minor issues that might affect the above:

(1) I'm assuming you run your script repeatedly in the same pool,
without deleting the pool. If that is the case, then a run of X+1 should
dedup completely against the run of X.  E.g. a run with 12 blocks will
dedup its first 11 blocks against the prior run of 11.
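A quick way to confirm that the runs really are deduping against each
other is to look at the pool's dedup ratio between runs, along these
lines (with "tank" as a placeholder pool name):

zpool get dedupratio tank
zdb -D tank          # summary of DDT entries and overall dedup ratio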


(2) Can you try it WITHOUT "verify" enabled?  Verify *requires* a disk read
before writing any potentially dedup-able block. If case #1 above applies,
then with verify on you *rapidly* increase the amount of disk I/O required
on each subsequent run.  E.g. the run of 10 requires no disk I/O for
verify, but the run of 11 requires 10 I/O requests, while the run of 12
requires 11 requests, etc.  This will skew your results as the ARC
buffering of file info changes over time.
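For reference, verify is controlled through the dedup property itself, so
switching it off is just a property change ("tank/test" is a placeholder
dataset name):

zfs get dedup tank/test          # see what is currently in effect
zfs set dedup=on tank/test       # checksum-only dedup, no verify reads
zfs set dedup=verify tank/test   # dedup plus a read-back verify per match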


(3) fflush is NOT the same as fsync.  If you're running the script in a 
loop, it's entirely possible that ZFS hasn't completely committed things 
to disk yet, which means that you get I/O requests to flush out the ARC 
write buffer in the middle of your runs.   Honestly, I'd do the 
following for benchmarking:


i=0
while [ $i -lt 80 ]
do
    # block-count argument for this run: starts at 10, goes up by 1 each pass
    j=$[10 + ($i * 1)]
    ./run_your_script $j
    # make sure the run that just finished is fully committed before the next
    sync
    sleep 10
    i=$[$i+1]
done



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA




Re: [zfs-discuss] Summary: Dedup and L2ARC memory requirements

2011-05-07 Thread Edward Ned Harvey
New problem:

I'm following all the advice I summarized in the OP of this thread, and
testing on a test system (a laptop).  And it's just not working.  I am
jumping into the dedup performance abyss far, far earlier than predicted...


My test system is a laptop with 1.5G RAM, c_min = 150M, c_max = 1.2G.
I have just a single SATA 7.2krpm hard drive, no SSD.
Before I start, I have 1G free RAM (according to top).
According to everything we've been talking about, I expect roughly 1G
divided by 376 bytes = 2855696 (2.8M) blocks in my pool before I run out
of RAM to hold the DDT and performance degrades.
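Spelled out, that arithmetic is just the following, using the ~376 bytes
per in-core DDT entry figure from earlier in the thread:

echo $(( 1024 * 1024 * 1024 / 376 ))    # = 2855696, i.e. ~2.8M DDT entries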

I create a pool.  Enable dedup.  Set recordsize=512
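In other words, roughly this (pool name and device are placeholders, not
necessarily the exact ones I used):

zpool create testpool c0t1d0        # single-disk pool on the laptop's drive
zfs set dedup=on testpool
zfs set recordsize=512 testpool
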
I write a program that will very quickly generate unique non-dedupable data:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    int i;
    int numblocks=atoi(argv[1]);
    // Note: Expect one command-line argument integer.
    FILE *outfile;
    outfile=fopen("junk.file","w");
    // One straightforward way to make every block unique: write a different
    // integer per block, padded out to a 512-byte record.
    for (i=0; i<numblocks; i++) {
        fprintf(outfile,"%512d",i);
    }
    fclose(outfile);
    return 0;
}
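
Compiled and run along these lines ("junkgen" is just a placeholder name):

gcc -o junkgen junkgen.c
./junkgen 400000     # writes 400,000 unique 512-byte records into junk.file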