Re: Slow Soft-RAID 5 performance
koan wrote:
> Are you sure about that chunk size? In your initial posting you show
> /proc/mdstat reporting:
>
> "md2 : active raid5 sdc3[2] sda3[0] sdb3[1]
>  780083968 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]"
>
> Which would seem to state a 128K chunk, and thus with a 4k block size
> you would need a stride of 32.

Hi Koan,

Yes, I'm sure. That 128K chunk was my initial setup, before the
enlightenment from http://tldp.org/HOWTO/Software-RAID-HOWTO-5.html
My reported tests were made with a 256K chunk.

> On 7/18/07, Rui Santos <[EMAIL PROTECTED]> wrote:
>> koan wrote:
>>> How did you create the ext3 filesystem?
>>
>> The chunk_size is at 256KB, ext3 block size is 4k. I believe the
>> correct option that should be passed through to --stride is 64.
>> Am I correct?
>>
>> I've also tested (after sending my first report) with xfs.
>> I've also increased readahead to 65535 on all HDs.
>> I've also increased stripe_cache_size to 16384.
>>
>> I can now get ~100MB/sec...
>>
>>> Did you use the appropriate --stride option as noted here:
>>> http://tldp.org/HOWTO/Software-RAID-HOWTO-5.html (#5.11)
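For reference, the stride arithmetic being debated is simply the RAID
chunk size divided by the filesystem block size. A minimal sketch with
the values from this thread:

    # Stride = RAID chunk size / ext3 block size.
    CHUNK_KB=256; BLOCK_KB=4
    echo $(( CHUNK_KB / BLOCK_KB ))   # prints 64; the old 128K chunk gives 32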
Re: Slow Soft-RAID 5 performance
Are you sure about that chunk size? In your initial posting you show
/proc/mdstat reporting:

"md2 : active raid5 sdc3[2] sda3[0] sdb3[1]
 780083968 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]"

Which would seem to state a 128K chunk, and thus with a 4k block size
you would need a stride of 32.

On 7/18/07, Rui Santos <[EMAIL PROTECTED]> wrote:
> koan wrote:
>> How did you create the ext3 filesystem?
>
> The chunk_size is at 256KB, ext3 block size is 4k. I believe the
> correct option that should be passed through to --stride is 64.
> Am I correct?
>
> I've also tested (after sending my first report) with xfs.
> I've also increased readahead to 65535 on all HDs.
> I've also increased stripe_cache_size to 16384.
>
> I can now get ~100MB/sec...
>
>> Did you use the appropriate --stride option as noted here:
>> http://tldp.org/HOWTO/Software-RAID-HOWTO-5.html (#5.11)
Re: Slow Soft-RAID 5 performance
J.A. Magallón wrote:
> On Wed, 18 Jul 2007 10:56:11 +0100, Rui Santos <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I'm getting a strange slow performance behavior on a recently installed
>> server. Here are the details:
>>
> ...
>> I can get a write throughput of 60 MB/sec on each HD by issuing the
>> command 'time `dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 /
>> 4 )); sync`'
>>
> ...
>> The RAID device I'm testing on is /dev/md2. Now, by issuing the same
>> command 'dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 / 4 ));
>> sync' on the raid device mount point, I get the following speeds:
>> With stripe_cache_size at the default '256': 51 MB/sec
>> With stripe_cache_size at '8192': 73 MB/sec
>
> I know many people consider this stupid, but can you post some
> hdparm -tT data?

Of course. Here's the output:

NewServer-RD:~ # hdparm -tT /dev/md2

/dev/md2:
 Timing cached reads:   1738 MB in  2.00 seconds = 868.93 MB/sec
 Timing buffered disk reads:  444 MB in  3.01 seconds = 147.69 MB/sec

NewServer-RD:~ # hdparm --direct -tT /dev/md2

/dev/md2:
 Timing O_DIRECT cached reads:   290 MB in  2.01 seconds = 144.05 MB/sec
 Timing O_DIRECT disk reads:  396 MB in  3.01 seconds = 131.75 MB/sec

> The culprit can be the filesystem+pagecache, the md driver or the disk
> driver, so I think trying just hdparm will show if the disk or md are
> going nuts...
>
> In my case, I have a box with 2 raids, one with SCSI disks and one with
> IDE ones.
>
> Some results:
>
> lsscsi:
> [0:0:0:0]  disk    IBM      DDYS-T18350N     S96H  /dev/sda
> [2:0:0:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdb
> [2:0:1:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdc
> [2:0:2:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdd
> [2:0:3:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sde
> [3:0:0:0]  disk    ATA      ST3120022A       3.06  /dev/sdf
> [3:0:1:0]  cd/dvd  HL-DT-ST DVDRAM GSA-4040B A300  /dev/sr0
> [4:0:0:0]  disk    ATA      ST3120022A       3.76  /dev/sdg
>
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Mon Jun 18 13:40:57 2007
>      Raid Level : raid5
>      Array Size : 107522304 (102.54 GiB 110.10 GB)
>   Used Dev Size : 35840768 (34.18 GiB 36.70 GB)
>    Raid Devices : 4
>   Total Devices : 4
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Jul 18 13:31:22 2007
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 0
>
>          Layout : left-symmetric
>      Chunk Size : 256K
>
>            UUID : 51ad72a7:a4d20d15:0f3ea3a1:5ccb49a0
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8       17        0      active sync   /dev/sdb1
>        1       8       33        1      active sync   /dev/sdc1
>        2       8       49        2      active sync   /dev/sdd1
>        3       8       65        3      active sync   /dev/sde1
>
> This is, four scsi disks on an Adaptec U320, doing raid5:
>
> /dev/sdb:
>  Timing cached reads:   904 MB in  2.00 seconds = 451.84 MB/sec
>  Timing buffered disk reads:  228 MB in  3.00 seconds = 75.90 MB/sec
> /dev/sdc:
>  Timing buffered disk reads:  226 MB in  3.01 seconds = 75.01 MB/sec
> /dev/sdd:
>  Timing buffered disk reads:  228 MB in  3.00 seconds = 75.88 MB/sec
> /dev/sde:
>  Timing buffered disk reads:  226 MB in  3.00 seconds = 75.31 MB/sec
>
> /dev/md0:
>  Timing buffered disk reads:  562 MB in  3.01 seconds = 186.88 MB/sec
>
> Nearly 75x3 = 225 MB/s. And this looks like a small regression, I
> remember having seen 200MB/s on this setup with previous kernels.
> Performance is like 186/225 = 83%.
>
> And /dev/md1, raid0 on 2 IDE disks:
>
> /dev/sdf:
>  Timing buffered disk reads:  148 MB in  3.02 seconds = 48.93 MB/sec
> /dev/sdg:
>  Timing buffered disk reads:  124 MB in  3.00 seconds = 41.33 MB/sec
>
> /dev/md1:
>  Timing buffered disk reads:  204 MB in  3.01 seconds = 67.68 MB/sec
>
> Performance: 67 / 90 = 75%, more or less... not too good.
>
> Now that I read the hdparm man page, perhaps it would be better to
> repeat the tests with hdparm --direct.

Thanks for your reply.

Rui Santos
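To separate a slow member disk from an md-layer problem, the same
--direct test can be run over each component as well as the array. A
minimal sketch, assuming the sda/sdb/sdc device names from this thread:

    # Compare raw O_DIRECT read throughput: each disk vs. the array.
    for dev in /dev/sda /dev/sdb /dev/sdc /dev/md2; do
        hdparm --direct -t "$dev"
    done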
Re: Slow Soft-RAID 5 performance
koan wrote:
> How did you create the ext3 filesystem?

The chunk_size is at 256KB, ext3 block size is 4k. I believe the correct
option that should be passed through to --stride is 64.
Am I correct?

I've also tested (after sending my first report) with xfs.
I've also increased readahead to 65535 on all HDs.
I've also increased stripe_cache_size to 16384.

I can now get ~100MB/sec...

> Did you use the appropriate --stride option as noted here:
> http://tldp.org/HOWTO/Software-RAID-HOWTO-5.html (#5.11)
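For anyone reproducing those two tweaks, a minimal sketch of the
corresponding interfaces (run as root; md2 and the sd[abc] device names
are assumed from this thread's setup):

    # md stripe cache, in pages per device (RAID5/6 write-back cache).
    echo 16384 > /sys/block/md2/md/stripe_cache_size

    # Per-disk readahead, in 512-byte sectors.
    for dev in sda sdb sdc; do
        blockdev --setra 65535 /dev/$dev
    done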
Re: Slow Soft-RAID 5 performance
On Wed, 18 Jul 2007 10:56:11 +0100, Rui Santos <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm getting a strange slow performance behavior on a recently installed
> server. Here are the details:
>
...
> I can get a write throughput of 60 MB/sec on each HD by issuing the
> command 'time `dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 /
> 4 )); sync`'
>
...
> The RAID device I'm testing on is /dev/md2. Now, by issuing the same
> command 'dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 / 4 ));
> sync' on the raid device mount point, I get the following speeds:
> With stripe_cache_size at the default '256': 51 MB/sec
> With stripe_cache_size at '8192': 73 MB/sec

I know many people consider this stupid, but can you post some hdparm -tT
data?

The culprit can be the filesystem+pagecache, the md driver or the disk
driver, so I think trying just hdparm will show if the disk or md are
going nuts...

In my case, I have a box with 2 raids, one with SCSI disks and one with
IDE ones.

Some results:

lsscsi:
[0:0:0:0]  disk    IBM      DDYS-T18350N     S96H  /dev/sda
[2:0:0:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdb
[2:0:1:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdc
[2:0:2:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sdd
[2:0:3:0]  disk    SEAGATE  ST336807LW       0C01  /dev/sde
[3:0:0:0]  disk    ATA      ST3120022A       3.06  /dev/sdf
[3:0:1:0]  cd/dvd  HL-DT-ST DVDRAM GSA-4040B A300  /dev/sr0
[4:0:0:0]  disk    ATA      ST3120022A       3.76  /dev/sdg

/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Jun 18 13:40:57 2007
     Raid Level : raid5
     Array Size : 107522304 (102.54 GiB 110.10 GB)
  Used Dev Size : 35840768 (34.18 GiB 36.70 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Jul 18 13:31:22 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 256K

           UUID : 51ad72a7:a4d20d15:0f3ea3a1:5ccb49a0
         Events : 0.2

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1

This is, four scsi disks on an Adaptec U320, doing raid5:

/dev/sdb:
 Timing cached reads:   904 MB in  2.00 seconds = 451.84 MB/sec
 Timing buffered disk reads:  228 MB in  3.00 seconds = 75.90 MB/sec
/dev/sdc:
 Timing buffered disk reads:  226 MB in  3.01 seconds = 75.01 MB/sec
/dev/sdd:
 Timing buffered disk reads:  228 MB in  3.00 seconds = 75.88 MB/sec
/dev/sde:
 Timing buffered disk reads:  226 MB in  3.00 seconds = 75.31 MB/sec

/dev/md0:
 Timing buffered disk reads:  562 MB in  3.01 seconds = 186.88 MB/sec

Nearly 75x3 = 225 MB/s. And this looks like a small regression, I remember
having seen 200MB/s on this setup with previous kernels.
Performance is like 186/225 = 83%.

And /dev/md1, raid0 on 2 IDE disks:

/dev/sdf:
 Timing buffered disk reads:  148 MB in  3.02 seconds = 48.93 MB/sec
/dev/sdg:
 Timing buffered disk reads:  124 MB in  3.00 seconds = 41.33 MB/sec

/dev/md1:
 Timing buffered disk reads:  204 MB in  3.01 seconds = 67.68 MB/sec

Performance: 67 / 90 = 75%, more or less... not too good.

Now that I read the hdparm man page, perhaps it would be better to repeat
the tests with hdparm --direct.

--
J.A. Magallon <jamagallon()ono!com>     \ Software is like sex:
                                        \   It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.21-jam12 (gcc 4.2.1 20070704 (4.2.1-3mdv2008.0)) SMP PREEMPT
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
RE: Slow Soft-RAID 5 performance
How did you create the ext3 filesystem?

Did you use the appropriate --stride option as noted here:
http://tldp.org/HOWTO/Software-RAID-HOWTO-5.html (#5.11)
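A hedged sketch of the command in question, using the 4k ext3 block size
and 256K chunk discussed elsewhere in the thread. Note that mke2fs has
spelled this option both -R stride= (older releases, as in the HOWTO)
and -E stride= (newer releases):

    # ext3 with stride matched to the RAID chunk: 256K / 4K = 64.
    mke2fs -j -b 4096 -E stride=64 /dev/md2
    # Older e2fsprogs spelling:
    #   mke2fs -j -b 4096 -R stride=64 /dev/md2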
Re: Slow Soft-RAID 5 performance
On Wed, 18 Jul 2007, Rui Santos wrote:

> Hi,
>
> I'm getting a strange slow performance behavior on a recently installed
> server. Here are the details:
>
> Server: Asus AS-TS500-E4A
> Board: Asus DSBV-D
> ( http://uk.asus.com/products.aspx?l1=9&l2=39&l3=299&l4=0&model=1210&modelmenu=2 )
> Hard Drives: 3x Seagate ST3400620AS
> ( http://www.seagate.com/ww/v/index.jsp?vgnextoid=8eff99f4fa74c010VgnVCM10dd04090aRCRD&locale=en-US )
>
> I'm using the AHCI driver, although with ata_piix, the behavior is the
> same. Here's some info about the AHCI controller:

With three disks, if everything was perfect, yeah, 120MB/s writes. When I
started out with 4 raptors I was getting 164MB/s read and write. By
default, with no optimizations, you will not get good speed. With no
optimizations with 10 raptors I get 180-200MB/s; with optimizations,
464MB/s write and 622MB/s read.

1. Use XFS if you want speed.
2. Use 128k, 256k or 1MiB chunk size.
3. Use 8192k, 16384k stripe_cache_size.
4. Use 65536 readahead size.

These are only some of the optimizations I use.

Justin.
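Of Justin's four items, numbers 3 and 4 are the runtime tunables already
shown earlier in the thread; numbers 1 and 2 are fixed when the array and
filesystem are created. A minimal sketch, assuming a rebuild of this
thread's three-partition array (mdadm --create destroys existing data;
illustration only):

    # Item 2: the chunk size is set at array-creation time (256k here).
    mdadm --create /dev/md2 --level=5 --raid-devices=3 --chunk=256 \
          /dev/sda3 /dev/sdb3 /dev/sdc3

    # Item 1: XFS instead of ext3.
    mkfs.xfs -f /dev/md2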
Slow Soft-RAID 5 performance
Hi,

I'm getting a strange slow performance behavior on a recently installed
server. Here are the details:

Server: Asus AS-TS500-E4A
Board: Asus DSBV-D
( http://uk.asus.com/products.aspx?l1=9&l2=39&l3=299&l4=0&model=1210&modelmenu=2 )
Hard Drives: 3x Seagate ST3400620AS
( http://www.seagate.com/ww/v/index.jsp?vgnextoid=8eff99f4fa74c010VgnVCM10dd04090aRCRD&locale=en-US )

I'm using the AHCI driver, although with ata_piix, the behavior is the
same. Here's some info about the AHCI controller:

00:1f.2 SATA controller: Intel Corporation 631xESB/632xESB SATA Storage
Controller AHCI (rev 09) (prog-if 01 [AHCI 1.0])
        Subsystem: ASUSTeK Computer Inc. Unknown device 81dc
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19
        I/O ports at 18c0 [size=8]
        I/O ports at 1894 [size=4]
        I/O ports at 1898 [size=8]
        I/O ports at 1890 [size=4]
        I/O ports at 18a0 [size=32]
        Memory at c8000400 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [70] Power Management version 2
        Capabilities: [a8] #12 [0010]

The kernel boot log is attached as boot.msg.

I can get a write throughput of 60 MB/sec on each HD by issuing the
command 'time `dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 /
4 )); sync`'

Until this point everything seems acceptable, IMHO. The problem starts
when I test the software RAID on all three HDs.

Configuration: output of 'sfdisk -l'

Disk /dev/sda: 48641 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sda1   *      0+     16      17-    136521   fd  Linux raid autodetect
/dev/sda2         17      82      66     530145   fd  Linux raid autodetect
/dev/sda3         83   48640   48558  390042135   fd  Linux raid autodetect
/dev/sda4          0       -       0          0    0  Empty

Disk /dev/sdb: 48641 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdb1   *      0+     16      17-    136521   fd  Linux raid autodetect
/dev/sdb2         17      82      66     530145   fd  Linux raid autodetect
/dev/sdb3         83   48640   48558  390042135   fd  Linux raid autodetect
/dev/sdb4          0       -       0          0    0  Empty

Disk /dev/sdc: 48641 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdc1   *      0+     16      17-    136521   fd  Linux raid autodetect
/dev/sdc2         17      82      66     530145   fd  Linux raid autodetect
/dev/sdc3         83   48640   48558  390042135   fd  Linux raid autodetect
/dev/sdc4          0       -       0          0    0  Empty

Configuration: output of 'cat /proc/mdstat'

Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [linear]
md0 : active raid1 sda1[0] sdc1[2] sdb1[1]
      136448 blocks [3/3] [UUU]

md1 : active raid5 sda2[0] sdc2[2] sdb2[1]
      1060096 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]

md2 : active raid5 sdc3[2] sda3[0] sdb3[1]
      780083968 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

The RAID device I'm testing on is /dev/md2. Now, by issuing the same
command 'dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 / 4 ));
sync' on the raid device mount point, I get the following speeds:

With stripe_cache_size at the default '256': 51 MB/sec
With stripe_cache_size at '8192': 73 MB/sec

Extra notes:
- All HDs have queue_depth at '31', which means NCQ is on. If I disable
  NCQ by setting the value to '1', the write speed achieved is lower.
- Although I started from a fresh openSUSE 10.2 installation, I'm now
  running a vanilla 2.6.22.1 kernel.
- The kernel is running the generic x86-64 configuration.
- The soft-RAID bitmap is disabled. If I enable it, the performance
  takes a serious hit.
- The processor is an Intel Xeon dual-core 5060 (family 15) with
  Hyper-Threading enabled; with it disabled, the performance on this
  specific subject is the same.
- The filesystem is ext3.

Final question: shouldn't I, at least, be able to get write speeds of
120MB/sec instead of the current 73MB/sec? Is this a soft-RAID problem,
or could it be something else? Or am I just missing something?

Thanks for your time,
Rui Santos

[Attachment: boot.msg (kernel boot log; truncated in the archive)]
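For the arithmetic behind the 120MB/sec figure: a large sequential RAID5
write streams across n-1 data disks, so three disks at ~60 MB/sec each
should approach (3-1) x 60 = 120 MB/sec. As a side note, a sketch of the
same dd test with the final flush folded into dd's own timing (GNU dd's
conv=fdatasync), which keeps the page cache from inflating the reported
rate:

    # Write 1 GiB to a file on the array's mount point; dd reports the
    # rate including the fdatasync at the end.
    dd if=/dev/zero of=test.raw bs=4k count=$(( 1024 * 1024 / 4 )) conv=fdatasync
    rm -f test.raw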