Hello,

We have a problem with our disk array. It might even be in the HW; I'm not
sure. The array holds our users' home directories and mail.

HW configuration:

an HP DL585 server with four 6-core Opterons and 128 GiB RAM

array: an IBM DS4300 with 7 LUNs, each a RAID5 of 4 disks (250 GB each).
Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA
  (the array only supports 2Gb)
NCQ queue depth is 32
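
For reference, the queue depth can be read back per LUN from sysfs
(assuming the LUNs show up as sda..sdg, as in the iostat output below):

$ grep . /sys/block/sd[a-g]/device/queue_depth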

SW configuration:

RHEL5.3

the home partition is a linear LVM volume:

# lvdisplay -m /dev/array-vg/newhome
  --- Logical volume ---
  LV Name                /dev/array-vg/newhome
  VG Name                array-vg
  LV UUID                9XxWH5-5yv4-t661-K24d-Hdzg-G0aW-zUxRul
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                2.18 TB
  Current LE             571393
  Segments               9
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:7
 
  --- Segments ---
  Logical extent 0 to 66998:
    Type                linear
    Physical volume     /dev/sda
    Physical extents    111470 to 178468
 
  Logical extent 66999 to 133997:
    Type                linear
    Physical volume     /dev/sdb
    Physical extents    111470 to 178468
 
  Logical extent 133998 to 200996:
    Type                linear
    Physical volume     /dev/sdc
    Physical extents    111470 to 178468
 
  Logical extent 200997 to 267995:
    Type                linear
    Physical volume     /dev/sdd
    Physical extents    111470 to 178468
 
  Logical extent 267996 to 334994:
    Type                linear
    Physical volume     /dev/sde
    Physical extents    111470 to 178468
 
  Logical extent 334995 to 401993:
    Type                linear
    Physical volume     /dev/sdf
    Physical extents    111470 to 178468
 
  Logical extent 401994 to 468992:
    Type                linear
    Physical volume     /dev/sdg
    Physical extents    111470 to 178468
 
  Logical extent 468993 to 527946:
    Type                linear
    Physical volume     /dev/sdg
    Physical extents    15945 to 74898
 
  Logical extent 527947 to 571392:
    Type                linear
    Physical volume     /dev/sdc
    Physical extents    15945 to 59390
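
A more compact view of the same segment-to-PV mapping (output omitted):

# lvs --segments -o +devices array-vg/newhome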

All LUNs use the deadline scheduler.
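
For completeness, the scheduler can be double-checked (and re-applied) per
device along these lines; sda below is just an example:

$ grep . /sys/block/sd[a-g]/queue/scheduler   # active scheduler in brackets
# echo deadline > /sys/block/sda/queue/scheduler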

Now the problem: 
whenever there is a 'large' write (on the order of hundreds of megabytes),
the system load rises considerably.
Inspection with iostat shows that, from something like this:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             373.00         8.00      7792.00          8       7792
sdb              11.00         8.00        80.00          8         80
sdc              13.00         8.00        96.00          8         96
sdd               9.00         8.00        80.00          8         80
sde              23.00         8.00       296.00          8        296
sdf               9.00         8.00        80.00          8         80
sdg               5.00         8.00        32.00          8         32

after a $ dd if=/dev/zero of=file bs=$((2**20)) count=128
it goes to this:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdc               0.00         0.00         0.00          0          0
sdd               0.00         0.00         0.00          0          0
sde              31.00         8.00     28944.00          8      28944
sdf               1.00         8.00         0.00          8          0
sdg               1.00         8.00         0.00          8          0

and when I generate some reads it goes from

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             171.00      3200.00      3448.00       3200       3448
sdb              24.00      3336.00        56.00       3336         56
sdc              17.00      3280.00        16.00       3280         16
sdd              15.00      3208.00        24.00       3208         24
sde              18.00      3200.00        56.00       3200         56
sdf              18.00      3192.00        40.00       3192         40
sdg              23.00      3184.00       144.00       3184        144

to

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               5.00       392.00        88.00        392         88
sdb               2.00       352.00         0.00        352          0
sdc               2.00       264.00         0.00        264          0
sdd               2.00       264.00         0.00        264          0
sde             277.00       560.00     38744.00        560      38744
sdf               2.00       264.00         0.00        264          0
sdg               1.00       296.00         0.00        296          0

It looks like the single large write somehow starves out all other requests.
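
If it helps, extended per-device statistics (queue size, wait and service
times) can be collected while reproducing this with something like:

$ iostat -x 1

Watching whether avgqu-sz and await climb only on the device taking the
write should confirm that its queue is the bottleneck.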

Switching to a striped LVM volume would probably help, but the data
migration would be really painful for us.
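
For the record, the striped layout I have in mind would be created roughly
like this (name, stripe count/size and LV size are only illustrative, and
it would need enough free extents on all seven PVs):

# lvcreate -i 7 -I 64 -L 2.2T -n newhome-striped array-vg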

Does anyone have an idea where the problem might be? Any pointers would be
appreciated.

Regards,
Jiri Novosad
