Tim Moore <[EMAIL PROTECTED]> writes with a number of great questions:
> > The trouble I'm having is that my RAID-1 read performance appears
> > substantially worse than just using the same disk partitions directly.
>
> How did you measure bonnie performance on the raw partitions?
Sorry, the way I worded that was confusing. I didn't use the raw
partitions; I just did a 'raidstop /dev/md0', used mke2fs to put
filesystems on the ex-RAID partitions, and mounted them normally.
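In other words, roughly this sequence (/pa and /pb are the mount points
I use for the bonnie runs later in this message):

  raidstop /dev/md0
  mke2fs /dev/sda7     # the two type-fd partitions that make up md0
  mke2fs /dev/sdb6
  mount /dev/sda7 /pa
  mount /dev/sdb6 /pb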
> The only significant thing I can see is block reads @75% for this
> particular sample.
Exactly what concerns me. Given that the theoretical read performance of
RAID-1 is 200% of a single disk, 75% strikes me as a little low. (See the
last set of bonnie runs in this message for more evidence.) As I'd really
like to get this machine into production, I may just settle for 75%, but
I'm worried that it's a sign of something else being wrong. And the extra
performance would be swell, too.
> > SMP kernel (I only have one 450 MHz PII, but the board has a second
> > processor slot that I'll fill eventually). It has 128 MB RAM, and the
> > disks in question are identical quantum disks on the same NCR SCSI bus.
>
> Why the SMP kernel when there aren't 2 active CPUs?
I plan to add one shortly. The docs claimed it wasn't a problem to run
with a CPU missing, so this seemed like the easiest solution.
> > mke2fs -b 1024 -i 1536 /dev/md0
>
> Why did you change bytes/inode to 1.5?
Because the average size of the files I need to put on this partition is
about 1600 bytes, and 1536 (1024+512) was the nearest convenient value.
With the default bytes-per-inode, I'd run out of inodes before all the
content was loaded. With a more common value like 1024, I'd have too many
inodes and not enough disk space.
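For anyone checking the arithmetic, here it is as bc one-liners, using
the size of /dev/sda7 from the fdisk output below (8353768 blocks) and
assuming the mke2fs default of one inode per 4096 bytes (check your man
page):

  $ echo '8353768 * 1024 / 1600' | bc   # about how many 1600-byte files must fit
  5346411
  $ echo '8353768 * 1024 / 4096' | bc   # inodes at the default bytes-per-inode
  2088442
  $ echo '8353768 * 1024 / 1536' | bc   # inodes at -i 1536
  5569178

The default would run out of inodes well before the content was loaded;
-i 1536 leaves a modest cushion.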
> Also read the /usr/doc info on calculating stride.
The version I have doesn't say anything useful about stride in connection
with RAID-1, only RAID-4/5, so I left it alone. I'd be glad to change it
to any reasonable number, though.
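For reference, my understanding of the RAID-4/5 case is that stride is
the number of filesystem blocks per RAID chunk. With hypothetical 64 KB
chunks and 4 KB blocks, that would look something like this, using the
-R raid-options flag my mke2fs man page documents:

  # stride = chunk size / block size = 65536 / 4096 = 16
  mke2fs -b 4096 -R stride=16 /dev/md1   # hypothetical RAID-5 array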
Tim also asked for a bunch of output, which I will provide here:
> Please post output from fdisk -l /dev/sda /dev/sdb.
Disk /dev/sda: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot   Start     End    Blocks   Id  System
/dev/sda1   *        1      33    265041   83  Linux
/dev/sda2           34    1106   8618872+   5  Extended
/dev/sda5           34      50    136521   82  Linux swap
/dev/sda6           51      66    128488+  83  Linux
/dev/sda7           67    1106   8353768+  fd  Unknown

Disk /dev/sdb: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot   Start     End    Blocks   Id  System
/dev/sdb1   *        1      48    385528+  83  Linux
/dev/sdb2           50    1106   8490352+   5  Extended
/dev/sdb3           49      49      8032+  83  Linux
/dev/sdb5           50      66    136521   82  Linux swap
/dev/sdb6           67    1106   8353768+  fd  Unknown
> hdparm -tT /dev/md0 /dev/sda7 /dev/sdb6 /dev/md0 /dev/sda7 /dev/sdb6
/dev/md0:
Timing buffer-cache reads: 64 MB in 0.55 seconds =116.36 MB/sec
Timing buffered disk reads: 32 MB in 2.88 seconds =11.11 MB/sec
/dev/sda7:
Timing buffer-cache reads: 64 MB in 0.49 seconds =130.61 MB/sec
Timing buffered disk reads: 32 MB in 2.49 seconds =12.85 MB/sec
/dev/sdb6:
Timing buffer-cache reads: 64 MB in 0.49 seconds =130.61 MB/sec
Timing buffered disk reads: 32 MB in 2.44 seconds =13.11 MB/sec
/dev/md0:
Timing buffer-cache reads: 64 MB in 0.57 seconds =112.28 MB/sec
Timing buffered disk reads: 32 MB in 2.84 seconds =11.27 MB/sec
/dev/sda7:
Timing buffer-cache reads: 64 MB in 0.50 seconds =128.00 MB/sec
Timing buffered disk reads: 32 MB in 2.42 seconds =13.22 MB/sec
/dev/sdb6:
Timing buffer-cache reads: 64 MB in 0.48 seconds =133.33 MB/sec
Timing buffered disk reads: 32 MB in 2.42 seconds =13.22 MB/sec
> [time the creation of 100 MB file]
Here it is on the RAID filesystem:
[root@venus local]# time dd if=/dev/zero of=/usr/local/100MBtest bs=1k count=100k && \
  time dd if=/usr/local/100MBtest of=/dev/null bs=1k && time rm -rf /usr/local/100MBtest
102400+0 records in
102400+0 records out
0.070u 1.860s 0:05.79 33.3% 0+0k 0+0io 97pf+0w
102400+0 records in
102400+0 records out
0.130u 2.020s 0:09.82 21.8% 0+0k 0+0io 25691pf+0w
0.000u 0.100s 0:01.59 6.2% 0+0k 0+0io 88pf+0w
And here it is on a regular filesystem:
[root@venus local]# time dd if=/dev/zero of=/100MBtest bs=1k count=100k && \
  time dd if=/100MBtest of=/dev/null bs=1k && time rm -rf /100MBtest
102400+0 records in
102400+0 records out
0.110u 1.700s 0:04.57 39.6% 0+0k 0+0io 91pf+0w
102400+0 records in
102400+0 records out
0.140u 1.840s 0:07.87 25.1% 0+0k 0+0io 25694pf+0w
0.000u 0.100s 0:02.11 4.7% 0+0k 0+0io 95pf+0w
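For easier comparison, the same numbers as rough throughput; the file is
100 MB, and these runs include cache effects, so take them loosely. The
four elapsed times are md0 write, md0 read, plain write, plain read:

  $ for t in 5.79 9.82 4.57 7.87; do echo "scale=1; 100/$t" | bc; done
  17.2
  10.1
  21.8
  12.7

So even this crude test shows the RAID a bit behind on both writes
and reads.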
> Please post averaged results from several bonnie runs of 3x main memory.
Sure thing. Here are the results from the 384 MB runs of bonnie on the
RAID-1 device:
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
md0       384  5611 82.0 13586 20.5  4264 12.3  6032 86.2  8884 10.8 117.5  3.4
md0       384  5539 81.0 13568 21.0  4335 12.5  6038 86.2  8897 11.0 115.7  3.3
md0       384  5680 83.5 13584 20.6  4300 12.9  6025 86.6  8784 10.9 117.7  3.6
avg       384  5610 82.2 13579 20.7  4300 12.6  6032 86.3  8855 10.9 117.0  3.4
And here are the runs for the same disk partitions, reformatted and
remounted as /pa and /pb:
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
pa        384  5991 85.3 13389 17.6  4305 11.2  6156 86.3 11954 12.5 100.8  2.4
pa        384  6022 85.8 13326 17.9  4282 11.2  5884 82.7 11914 12.4 100.7  2.6
pa        384  6079 87.0 13440 17.8  4312 11.1  5943 85.1 12123 14.0 101.2  2.3
pb        384  6327 91.6 13490 18.1  4346 11.5  6135 86.7 12367 12.9 102.1  2.8
pb        384  6124 88.5 13506 17.6  4351 11.5  6137 85.7 12410 12.8 103.4  2.7
average   384  6109 87.6 13430 17.8  4319 11.3  6051 85.3 12154 12.9 102.0  2.6
I also ran another test, which was interesting: two copies of bonnie in
parallel, one each on /pa and /pb, the same partitions that make up the
RAID when I do the mkraid. Both bonnies started at the same time and
finished within a second of each other (the invocation is sketched after
the table):
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
pa        384  3312 47.9 10074 14.6  4463 12.6  3494 49.4 12235 13.2  77.6  2.2
pb        384  3332 47.9  9320 12.9  4411 11.7  3468 48.9 11991 13.4  83.5  1.8
sum       768  6644 95.8 19394 27.5  8874 24.3  6962 98.3 24226 26.6 161.1  4.0
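For anyone who wants to reproduce the parallel run, it was along these
lines; -d and -s are the directory and size flags in the bonnie version
I have, so check yours:

  bonnie -d /pa -s 384 > /tmp/bonnie-pa.log 2>&1 &
  bonnie -d /pb -s 384 > /tmp/bonnie-pb.log 2>&1 &
  wait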
This pretty clearly suggests that the hardware is capable of a lot more
than the RAID-1 is actually delivering. I'd expect RAID-1 block writes to
be no faster than a single disk, around 10 MB/s here, since every block
has to be written to both disks (they seem faster in these runs, perhaps
because of caching). But reads, which RAID-1 is free to balance across
both disks, should at least be in the ballpark of the 20-24 MB/s the
parallel runs show, shouldn't they, rather than the ~9 MB/s I'm getting?
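Putting percentages on that, using the block-read averages from the
tables above (bc truncates):

  $ echo '8855 * 100 / 12154' | bc    # md0 reads vs. one disk
  72
  $ echo '24226 * 100 / 12154' | bc   # parallel-bonnie sum vs. one disk
  199

Call it 73% of one disk from the mirror, versus nearly 200% from the same
two disks running flat out.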
Thanks again to everyone for their help! Please also note that I'd be
content with the answer "current implementation is slow". I know I'm
working with prerelease software here, so I'm grateful for what I
can get.
William