Hello Jiri,

The high load may be caused by I/O wait (check with sar -u; see the example commands after the list below). In any case, 30 MB/s seems a little slow for an FC array of any kind. I don't know your DS4300 at all, but if you're using a SAN or an FC loop to connect to your array, here are a few things you might want to look at:

- What kind of disks are used in your DS4300? 10k or 15k rpm FC disks? Did
  you check how heavily your disks were used during transfers? (There
  should be software provided with the array for that, perhaps even an
  embedded web server.)

- Did you monitor your array's Fibre Channel adapter activity? (Unless
  you're the sole user of the array and no other server can hit the same
  physical disks, in which case you're most likely not overloading it.)

- Do you have multiple paths from your server to your switch and/or to
  your array? (Even if the array is only active/passive and 2 Gb/s, having
  multiple paths provides redundancy and, with the correct configuration,
  better performance.)

- What kind of data is your FS holding (many little files, hundreds of
  thousands of files, etc.)? Tuning the FS or switching to a different FS
  type can help.

- If none of the above turns up a bottleneck, then striping might help
  (that's what we use here on active/active DMX arrays), but take care not
  to end up on the same physical disks at the array block level.
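
For the I/O wait and per-LUN checks above, something along these lines
should do (assuming the sysstat and device-mapper-multipath packages are
installed; the device names are only examples):

$ sar -u 1 10        # watch the %iowait column during a transfer
$ iostat -x 1        # per-LUN utilization (%util) and latency (await)
# multipath -ll      # list the paths behind each LUN, if dm-multipath is used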

My 2c,

Vincent

On Wed, 24 Mar 2010, Jiri Novosad wrote:

Hello,

we have a problem with our disk array. It might even be in the HW, I'm not sure.
The array holds our users' home directories + mail.

HW configuration:

a HP DL585 server, with four 6-core Opterons, 128GiB RAM

array: IBM DS4300 with 7 LUNs, each a RAID5 with 4 disks (250GB).
Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA
 (the array only supports 2 Gb/s)
NCQ queue depth is 32
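(for reference, the depth can be checked per LUN from sysfs, e.g.:
 $ cat /sys/block/sda/device/queue_depth
 32
where sda is one of the array LUNs)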

SW configuration:

RHEL5.3

the home partition is a linear LVM volume:

# lvdisplay -m /dev/array-vg/newhome
 --- Logical volume ---
 LV Name                /dev/array-vg/newhome
 VG Name                array-vg
 LV UUID                9XxWH5-5yv4-t661-K24d-Hdzg-G0aW-zUxRul
 LV Write Access        read/write
 LV Status              available
 # open                 1
 LV Size                2.18 TB
 Current LE             571393
 Segments               9
 Allocation             inherit
 Read ahead sectors     auto
 - currently set to     256
 Block device           253:7

 --- Segments ---
 Logical extent 0 to 66998:
   Type                linear
   Physical volume     /dev/sda
   Physical extents    111470 to 178468

 Logical extent 66999 to 133997:
   Type                linear
   Physical volume     /dev/sdb
   Physical extents    111470 to 178468

 Logical extent 133998 to 200996:
   Type                linear
   Physical volume     /dev/sdc
   Physical extents    111470 to 178468

 Logical extent 200997 to 267995:
   Type                linear
   Physical volume     /dev/sdd
   Physical extents    111470 to 178468

 Logical extent 267996 to 334994:
   Type                linear
   Physical volume     /dev/sde
   Physical extents    111470 to 178468

 Logical extent 334995 to 401993:
   Type                linear
   Physical volume     /dev/sdf
   Physical extents    111470 to 178468

 Logical extent 401994 to 468992:
   Type                linear
   Physical volume     /dev/sdg
   Physical extents    111470 to 178468

 Logical extent 468993 to 527946:
   Type                linear
   Physical volume     /dev/sdg
   Physical extents    15945 to 74898

 Logical extent 527947 to 571392:
   Type                linear
   Physical volume     /dev/sdc
   Physical extents    15945 to 59390

All LUNs use the deadline scheduler.
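(for reference, this is set per LUN through sysfs, e.g.:
 # cat /sys/block/sda/queue/scheduler
 # echo deadline > /sys/block/sda/queue/scheduler
)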

Now the problem:
whenever there is a 'large' write (on the order of hundreds of megabytes),
the system load rises considerably.
Inspection with iostat shows that the activity goes from something like this:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             373.00         8.00      7792.00          8       7792
sdb              11.00         8.00        80.00          8         80
sdc              13.00         8.00        96.00          8         96
sdd               9.00         8.00        80.00          8         80
sde              23.00         8.00       296.00          8        296
sdf               9.00         8.00        80.00          8         80
sdg               5.00         8.00        32.00          8         32

after running $ dd if=/dev/zero of=file bs=$((2**20)) count=128
it goes to this:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdc               0.00         0.00         0.00          0          0
sdd               0.00         0.00         0.00          0          0
sde              31.00         8.00     28944.00          8      28944
sdf               1.00         8.00         0.00          8          0
sdg               1.00         8.00         0.00          8          0

and when I generate some reads it goes from

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             171.00      3200.00      3448.00       3200       3448
sdb              24.00      3336.00        56.00       3336         56
sdc              17.00      3280.00        16.00       3280         16
sdd              15.00      3208.00        24.00       3208         24
sde              18.00      3200.00        56.00       3200         56
sdf              18.00      3192.00        40.00       3192         40
sdg              23.00      3184.00       144.00       3184        144

to

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               5.00       392.00        88.00        392         88
sdb               2.00       352.00         0.00        352          0
sdc               2.00       264.00         0.00        264          0
sdd               2.00       264.00         0.00        264          0
sde             277.00       560.00     38744.00        560      38744
sdf               2.00       264.00         0.00        264          0
sdg               1.00       296.00         0.00        296          0

It looks like the single large write somehow starves out all the other requests.

Switching to a striped LVM volume would probably help, but the data
migration would be really painful for us.
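
(To illustrate, I mean something along these lines -- the size and the LV
name are just placeholders, and all the data would still have to be copied
over to the new volume:
 # lvcreate -i 7 -I 256 -L 2200G -n newhome-striped array-vg
)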

Does anyone have an idea where the problem might be? Any pointers would be
appreciated.

Regards,
Jiri Novosad
