On Wednesday June 28, [EMAIL PROTECTED] wrote:
>
> Greetings,
>
> I'm new to the list, but not a new user to /dev/md. I was hoping someone here
> could tell me about their experiences with Software RAID-5 and IDE, as I
> seem to be having some issues.
>
> I just built a new system to play with (eventually, I'd like to use it as
> a large file store for my mp3 collection ;) that consists of: SuperMicro
> dual slot-1 (P6DBE), two P3/550/512/100's (Katmai), with 256mb SDRAM/PC100.
>
> The system has a single 20.0 gig (WDC) ide boot drive, nothing else hooked
> to the built-in motherboard IDE ports, in fact, the secondary IDE port is
> disabled in BIOS.
>
> I have two Promise Ultra/66 PCI controllers, and I'm using linux-2.4.0-test2-ac2.
> Connected to each Promise card is a 40.0 GIG Maxtor ATA/66 7200 RPM drive.
> (4 total, each on a dedicated IDE bus)
>
> During the sync phase of RAID-5, the rate starts out very fast,
> hovering around 18,000k/sec.
>
> A few minutes in, however, the rate drops to 5000k/sec for the duration of
> the resync, which is even slower than the resync rate of another box I have,
> which is a single-cpu PPRO/200 using 4 18gig SCSI drives and an
> Adaptec 2940UW.
The way raid5 resync works in 2.4 is that it reads a whole stripe,
checks the parity, and if it is wrong, re-writes the parity block.
This means that if the parity is actually correct, then all the drives
will be reading sequentially as fast as the system allows.
If there are any parity errors, the disk which stores the parity
block will have to stop streaming forward, back up a bit, write out
the block, and then start reading again.
If there are lots of parity errors, you will get lots of
backward/forward seeking on all drives, and the throughput will drop
markedly.
What you probably have is that the first part of the array has correct
parity, and this is checked at 18M/sec. After that, there are parity
errors and the whole process slows down.
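As a sketch of the logic (plain Python for illustration only, not the
actual kernel code; block contents and sizes here are made up):

```python
# Sketch of the 2.4 raid5 resync pass described above: read a whole
# stripe, check parity, and rewrite the parity block only if wrong.

def xor_blocks(blocks):
    """XOR a list of equal-sized byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def resync_stripe(data_blocks, parity_block):
    """Check one stripe; return (correct_parity, rewrite_needed).

    A rewrite forces the parity disk to stop streaming, seek back,
    and write - which is what kills sequential throughput.
    """
    expected = xor_blocks(data_blocks)
    if parity_block == expected:
        return parity_block, False   # all disks keep streaming forward
    return expected, True            # parity disk must back up and write

# Correct parity streams through; bad parity forces a rewrite.
d = [bytes([1, 2]), bytes([3, 4]), bytes([5, 6])]
good = xor_blocks(d)
assert resync_stripe(d, good) == (good, False)
assert resync_stripe(d, bytes(2)) == (good, True)
```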
So, resync is actually a very inefficient way to build a new RAID5
array. A much better approach is reconstruction.
Build your array with a failed non-existent drive and a spare, and let
it rebuild onto the spare.
Something like:
raiddev /dev/md0
        raid-level              5
        nr-raid-disks           4
        nr-spare-disks          1
        chunk-size              128
        parity-algorithm        left-symmetric
        device                  /dev/hda1
        raid-disk               0
        device                  /dev/hdb1
        raid-disk               1
        device                  /dev/hdc1
        raid-disk               2
        device                  /dev/hddoesntexist
        failed-disk             3
        device                  /dev/hdd1
        spare-disk              4
(/dev/hddoesntexist might actually need to be a name that appears in
/dev, but it doesn't need to refer to an existing drive)
If you build your array like this, it will create it degraded and
immediately start recovery onto the spare.
The recovery process reads the 3 good disks sequentially as fast as
possible, and writes the 1 bad disk slightly later, but it writes
sequentially so it goes at top speed too.
(here, "top speed" depends a bit on the speed of the drive, the speed
of the bus, the number of devices on the bus, and the speed of the
controller. In your setup, I suspect it should be able to sustain
18,000k/sec).
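The reason recovery can stay sequential is simple: for each stripe, the
missing member's block (whether it held data or parity there) is just the
XOR of the surviving members' blocks. A small sketch, again in plain
Python with made-up block contents, nothing to do with the kernel code:

```python
# Sketch of degraded-mode recovery: read the 3 surviving members
# sequentially, XOR them, and write the result to the spare -
# sequentially as well, so no back-seeking anywhere.

def xor_blocks(blocks):
    """XOR equal-sized byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def rebuild_missing(surviving):
    """Recover the failed member's block for one stripe."""
    return xor_blocks(surviving)

# 4-member stripe: three data blocks plus their parity.
data = [bytes([1, 2]), bytes([3, 4]), bytes([5, 6])]
members = data + [xor_blocks(data)]

# Whichever member is "failed", the XOR of the rest recovers it,
# regardless of whether it held data or parity in this stripe.
for failed in range(4):
    surviving = members[:failed] + members[failed + 1:]
    assert rebuild_missing(surviving) == members[failed]
```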
NeilBrown
>
> Watching the output of 'vmstat 1', 'bo' increases significantly just as the
> resync rate starts to fall, which would explain it. But this doesn't seem to
> happen on my SCSI box, which maintains a consistent rate all the way through.
>
> HERE
> ----
> r b w swpd free buff cache si so bi bo in cs us sy id
> 0 0 1 0 13212 196768 7840 0 0 18240 0 1248 1892 0 53 47
> 0 0 1 0 13208 196768 7844 0 0 18365 0 1251 2097 0 51 49
> 0 0 1 0 13208 196768 7844 0 0 18583 0 1302 2007 0 54 46
> 0 0 1 0 13208 196768 7844 0 0 18292 0 1239 1952 0 47 53
> 0 0 1 0 13208 196768 7844 0 0 18828 0 1271 2038 0 53 47
> 0 0 1 0 13208 196768 7844 0 0 14848 640 1270 2140 0 38 62
> 0 0 1 0 13208 196768 7844 0 0 6464 1441 1174 2152 0 11 88
> 0 0 1 0 13208 196768 7844 0 0 5360 1344 1056 1973 0 7 93
> 0 0 1 0 13208 196768 7844 0 0 5136 1280 1020 1826 0 6 94
> 0 0 1 0 13208 196768 7844 0 0 5344 1344 1053 1892 0 9 91
>
> Anyone else notice this?
>
> -George Shearer ([EMAIL PROTECTED])