Dear list,
I thought I'd just share my experiences with this 3Ware card, and see
if anyone might have any suggestions.
System: Supermicro H8DA8 with 2 x Opteron 250 2.4GHz and 4GB RAM
installed. 9550SX-8LP hosting 4x Seagate ST3250820SV 250GB in a RAID
1 plus 2 hot spares config. The array is properly initialized, write
cache is on, as is queueing (which the drives support). StoreSave is
set to Protection.
OS is CentOS 4.5 i386, minimal install, default partitioning as
suggested by the installer (ext3, small /boot on /dev/sda1, remainder
as / on LVM VolGroup with 2GB swap).
Firmware from 3Ware codeset 9.4.1.2 in use, firmware/driver details:
//serv1> /c0 show all
/c0 Driver Version = 2.26.05.007
/c0 Model = 9550SX-8LP
/c0 Memory Installed = 112MB
/c0 Firmware Version = FE9X 3.08.02.005
/c0 Bios Version = BE9X 3.08.00.002
/c0 Monitor Version = BL9X 3.01.00.006
I initially noticed something odd while installing 4.4: writing the
inode tables took much longer than I expected (I thought the
installer had frozen), and the system overall felt sluggish during
its first yum update, certainly more sluggish than I'd expect from a
comparatively powerful machine with hardware RAID 1.
I tried a few simple benchmarks (bonnie++, iozone, dd) and noticed up
to 8 pdflush processes hanging about in uninterruptible sleep when
writing to disk, along with kjournald and kswapd from time to time.
The load average during writing climbed considerably (up to >12),
with 'ls' taking up to 30 seconds to produce any output. I've tried
CentOS 4.4, 4.5, RHEL AS 4 update 5 (just in case) and openSUSE 10.2,
and they all show the same symptoms.
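In case anyone wants to reproduce the observation, this is roughly how I was spotting the blocked processes (standard procps tools, nothing 3Ware-specific):

```shell
# List processes stuck in uninterruptible sleep (state D), along with
# the kernel wait channel showing what each one is blocked on
ps -eo pid,state,wchan:30,comm | awk '$2 == "D"'
```

The 'b' column of vmstat 1 reports the same count once a second.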
Googling around makes me think that this may be related to queue
depth, nr_requests and possibly VM params (the latter from
https://bugzilla.redhat.com/show_bug.cgi?id=121434#c275). These are
the default settings:
/sys/block/sda/device/queue_depth = 254
/sys/block/sda/queue/nr_requests = 8192
/proc/sys/vm/dirty_expire_centisecs = 3000
/proc/sys/vm/dirty_ratio = 30
3Ware's performance tuning doc mentions elevator=deadline and
blockdev --setra 16384 along with nr_requests=512, but these alone
make no difference to the latency problem.
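For reference, this is how I applied those recommendations (paths assume the array is sda, as here; the settings don't persist across reboots):

```shell
# Readahead and request queue depth can be set at runtime:
blockdev --setra 16384 /dev/sda
echo 512 > /sys/block/sda/queue/nr_requests

# elevator=deadline goes on the kernel command line in
# /boot/grub/grub.conf (I'm not sure this kernel supports switching
# schedulers at runtime via sysfs), e.g.:
#   kernel /vmlinuz-2.6.9-55.EL ro root=/dev/VolGroup00/LogVol00 elevator=deadline
```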
Setting dirty_expire_centisecs = 1000 and dirty_ratio = 5 does indeed
reduce the number of processes in 'b' state as reported by vmstat 1
during an iozone benchmark (./iozone -s 20480m -r 64 -i 0 -i 1 -t 1
-b filename.xls as per 3Ware's own tuning doc) but the problem is
obviously still there, just mitigated somewhat. The comparison graphs
are in a PDF here:
http://community.novacaster.com/attach.pl/7411/482/iozone_vm_tweaks_xls.pdf
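For completeness, the VM knobs were changed like this (the sysctl names mirror the /proc/sys/vm paths above; add the lines to /etc/sysctl.conf to make them persistent):

```shell
# Expire dirty pages after 10s instead of 30s, and start forcing
# synchronous writeback at 5% dirty memory instead of 30%
sysctl -w vm.dirty_expire_centisecs=1000
sysctl -w vm.dirty_ratio=5

# Equivalent via /proc:
#   echo 1000 > /proc/sys/vm/dirty_expire_centisecs
#   echo 5 > /proc/sys/vm/dirty_ratio
```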
Incidentally, the vmstat 1 output was directed to an NFS-mounted disk
to avoid writing it to the array during the actual testing.
I've tried eliminating LVM from the equation, going to ext2 rather
than ext3, and booting single-processor, all to no useful effect.
I've also tried benchmarking with different block sizes from 512B to
1MB in powers of 2, and the problem remains: many processes in
uninterruptible sleep blocking other IO. I'm about to start
downloading CentOS 5 to give it a go, and after that I might have to
resort to seeing whether WinXP has the same issue.
My only real question is "where do I go from here?" I don't have
enough specific tuning knowledge to know what else to look at.
Thanks for any pointers.
Simon
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos