Hardware: Supermicro server with an Adaptec 5405 SAS controller and an LSI
expander feeding 24 drive bays. Currently one pool is 2x 1 TB SAS drives
striped, and a second pool is a single 750 GB SATA drive. I don't think the
hardware is related, though: if I turn off ZFS compression everything is
fine, and I see the same behavior on either pool. The only distinctive thing
I can think of is that I use a USB flash drive for root; performance on the
root pool is horrible, but the system otherwise works fine.
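
For what it's worth, a rough sketch of how the pool layout and compression
settings can be double-checked (the pool/dataset names below are just
placeholders, not my actual ones):

  zpool status tank                      # confirm the striped / single-disk pool layout
  zfs get compression tank tank/test     # shows gzip-9 vs. off per dataset
  zfs set compression=gzip-9 tank/test   # the setting that triggers the hangs
  zfs set compression=off tank/test      # with this, copies behave normally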

If I do a copy onto a filesystem with ZFS compression=gzip-9, Solaris hangs
for several seconds at a time. I have iostat -xcnCXTdz 5 running, so it
SHOULD be displaying stats every 5 seconds. Instead, the timestamps jump
from 06:01:20 to 06:02:04 (44 seconds):

Thu May 22 06:01:20 2008
     cpu
 us sy wt id
  0 13  0 86
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  253.5    0.0 16524.7    0.0  0.0 14.1    0.0   55.6   0  55 c4
  121.0    0.0 8140.8    0.0  0.0  8.5    0.0   70.2   0  30 c4t0d0
  132.6    0.0 8383.9    0.0  0.0  5.6    0.0   42.2   0  25 c4t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
Thu May 22 06:02:04 2008
     cpu
 us sy wt id
  0 98  0  2
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   42.4   38.7 2590.2 2752.5  0.0  1.9    0.0   24.0   0   8 c4
   21.5   19.1 1313.4 1353.3  0.0  1.1    0.0   26.2   0   4 c4t0d0
   20.8   19.5 1276.9 1399.2  0.0  0.9    0.0   21.7   0   4 c4t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.9    0.0  0.0  0.0    0.1   11.8   0   0 c6
    0.0    0.0    0.9    0.0  0.0  0.0    0.1   11.8   0   0 c6t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
Thu May 22 06:02:09 2008
     cpu
 us sy wt id
  0  6  0 94
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   27.4  249.4 2164.0 14078.1  0.0 68.9    0.0  249.1   0 200 c4
   15.0  128.8 1238.5 7252.9  0.0 34.3    0.0  238.8   0 100 c4t0d0
   12.4  120.6  925.5 6825.2  0.0 34.6    0.0  260.1   0 100 c4t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
Thu May 22 06:02:16 2008
     cpu
 us sy wt id
  0 82  0 18
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   54.4   14.8 3907.3  558.2  0.0  9.0    0.0  129.7   0  41 c4
   26.0    7.2 1891.3  282.6  0.0  4.2    0.0  126.7   0  18 c4t0d0
   28.3    7.6 2016.0  275.6  0.0  4.8    0.0  132.5   0  22 c4t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0

The copy is still going at this point, but the system is back to being
semi-responsive, possibly around the time the second file starts (intervals
of 7 seconds instead of 5). It looks as though the compression thread(s)
run at too high a priority.
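
I haven't profiled where the CPU time actually goes during the stall;
something like the following should show it next time it happens (assuming
the gzip work is done in kernel threads, which prstat won't list):

  mpstat 5                        # per-CPU usr/sys split while the copy runs
  lockstat -kIW -D 20 sleep 30    # sample the kernel profile; the top entries
                                  # should show whether compression dominates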

The files I'm copying for my test are:
-rw-r--r--  1 root root 2240902488 2008-05-21 19:32 it-20080106.zfs
-rw-r--r--  1 root root 1381914720 2008-05-21 19:40 it-20080131.zfs

They are saved zfs send streams, so they're pretty large, and their contents
are already compressed.

What concerns me about this isn't that I've successfully overloaded the CPU
(that's to be expected), but that NOTHING else seems to run at that point.
IMHO the scheduler should be servicing other requests instead of giving ZFS
compression all of the CPU. For example, if I try to ssh to the box while
this runs, I can't log in for almost a minute; it's simply unresponsive. I
didn't test much else, but I assume the entire system is hung.

I also noticed (perhaps by design) that with compression off, a copy returns
almost instantly, but the writes continue LONG after the cp process claims to
be done. Is this normal? Wouldn't closing the file ensure it was written to
disk? Is that tunable somewhere?
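
To make that timing concrete, the effect can be reproduced with something
like this (the dataset path is just a placeholder for my test filesystem;
substitute whatever you like):

  zfs set compression=off tank/test       # compression off for this test
  ptime cp it-20080106.zfs /tank/test/    # cp exits almost immediately...
  iostat -xn 5 12                         # ...yet writes to c4t0d0/c4t1d0 keep
                                          # showing up for a long while afterwards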
 
 