OK, I've just done some more testing on this machine to try to find where my 
bottlenecks are, and something very odd is going on.  As best I can tell there 
are now two separate problems:

- something is throttling network output to 10MB/s
- something is throttling zfs send to around 20MB/s

I've verified the network throughput with mbuffer:

1.  A quick local mbuffer test from /dev/zero to /dev/null gave me 565MB/s.
2.  On a test server, mbuffer sending from /dev/zero on that machine to 
/dev/null on another gave me 37MB/s.
3.  On the live server, mbuffer sending from /dev/zero to the same receiving 
machine gave me just under 10MB/s.

This looks very much like mbuffer's network output is being throttled on this 
machine, but I know NFS can give me 60-80MB/s.  Can anybody give me a clue as 
to what could be causing this?
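For reference, the tests were along these lines (the port number and the 
"receiver" hostname are just placeholders):

     # mbuffer -s 128k -m 512M -i /dev/zero -o /dev/null          (local test)
     # mbuffer -s 128k -m 512M -I 9090 > /dev/null                (on the receiving machine)
     # mbuffer -s 128k -m 512M -i /dev/zero -O receiver:9090      (on the sending machine)

The speeds quoted above are the rates mbuffer itself reports.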


And the disk performance is just as confusing.  Again I used a test server for 
comparison, and this time ran a zpool scrub alongside zpool iostat to check 
what the disks are actually capable of.

Live server:  5 sets of 3-way mirrors
Test server:  5-disk raid-z2

1.  On the Live server, zfs send to /dev/null via mbuffer reports a speed of 
21MB/s:
     # zfs send rc-pool@<snapshot> | mbuffer -s 128k -m 512M > /dev/null
2.  On the Test server, the same zfs send to /dev/null via mbuffer reports a 
speed of 35MB/s.
3.  On the Live server, zpool scrub and zpool iostat report a peak of 3k iops 
and 283MB/s throughput (commands below).
4.  On the Test server, zpool scrub and zpool iostat report a peak of 472 iops 
and 53MB/s throughput.
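
For anyone wanting to check the same figures, the scrub numbers came from 
something like this, with rc-pool being the live pool and 30 second intervals 
on the iostat:

     # zpool scrub rc-pool
     # zpool iostat -v rc-pool 30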

Surely the send and scrub operations should give similar results?  Why is zpool 
scrub running 10-15x faster than zfs send on the live server?

The iostat figures on the live server are particularly telling.

During a scrub (30s intervals):
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rc-pool      734G  1.55T  2.94K     41   189M   788K
  mirror     144G   320G    578      6  39.2M   166K
    c1t1d0      -      -    379      5  39.9M   166K
    c1t2d0      -      -    379      5  39.9M   166K
    c2t1d0      -      -    385      5  40.1M   166K
  mirror     147G   317G    633      2  37.8M   170K
    c1t3d0      -      -    389      2  38.7M   171K
    c2t2d0      -      -    393      2  38.9M   171K
    c2t0d0      -      -    384      2  38.9M   171K
  mirror     147G   317G    619      6  37.3M  57.5K
    c2t3d0      -      -    377      2  38.3M  57.9K
    c1t5d0      -      -    377      2  38.3M  57.9K
    c1t4d0      -      -    373      3  38.2M  57.9K
  mirror     148G   316G    638     10  37.6M  64.0K
    c2t4d0      -      -    375      4  38.5M  64.4K
    c2t5d0      -      -    386      6  38.2M  64.4K
    c1t6d0      -      -    384      6  38.2M  64.4K
  mirror     149G   315G    540      6  37.4M   164K
    c1t7d0      -      -    356      4  38.1M   164K
    c2t6d0      -      -    362      5  38.2M   164K
    c2t7d0      -      -    361      5  38.2M   164K
  c3d1p0      12K   504M      0      8      0   166K
----------  -----  -----  -----  -----  -----  -----

During a send (30s intervals):
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rc-pool      734G  1.55T    148     55  18.6M  1.71M
  mirror     144G   320G     25      6  3.15M   235K
    c1t1d0      -      -      8      3  1.02M   235K
    c1t2d0      -      -      7      3   954K   235K
    c2t1d0      -      -      9      3  1.19M   235K
  mirror     147G   317G     27      3  3.40M   203K
    c1t3d0      -      -      8      2  1.03M   203K
    c2t2d0      -      -      9      3  1.25M   203K
    c2t0d0      -      -      8      2  1.11M   203K
  mirror     147G   317G     32      2  4.12M   205K
    c2t3d0      -      -     11      1  1.45M   205K
    c1t5d0      -      -     10      1  1.34M   205K
    c1t4d0      -      -     10      1  1.34M   205K
  mirror     148G   316G     32      2  4.02M   201K
    c2t4d0      -      -     10      1  1.37M   201K
    c2t5d0      -      -      9      1  1.23M   201K
    c1t6d0      -      -     11      1  1.43M   201K
  mirror     149G   315G     31      6  3.89M   180K
    c1t7d0      -      -     11      2  1.45M   180K
    c2t6d0      -      -      8      2  1.10M   180K
    c2t7d0      -      -     10      2  1.35M   180K
  c3d1p0      12K   504M      0     34      0   727K
----------  -----  -----  -----  -----  -----  -----

Can anybody explain why zfs send could be so slow on one server?  Is anybody 
else able to compare their zpool iostat results for a zfs send and a zpool 
scrub, to see if they also have such a huge difference between the figures?
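
To make the comparison easy to reproduce, this is all it should take (pool and 
snapshot names are placeholders; run the iostat in a second terminal and leave 
it running):

     # zpool iostat -v <pool> 30
     # zfs send <pool>@<snapshot> | mbuffer -s 128k -m 512M > /dev/null
     # zpool scrub <pool>                  (once the send has finished)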

thanks,

Ross