prad wrote:

4.1 with 4 18G drives one thought is to do a raid1, but we really
don't  want 3 identical copies. is the only way to have 2 36G mirrors,
by using raid0+1 or raid1+0?

raid10 is strongly preferred -- i.e. you make a series of raid1 pairs
and then stripe across them.  This is high performance and resilient to
disk failures -- it can conceivably survive loss of half your drives so
long as it's only one from each RAID1 pair.
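
As a rough illustration of the arithmetic for the four 18G drives, here is a minimal Python sketch.  The function names and the pairing convention (drives 0+1 form one mirror, 2+3 the next) are assumptions made for the example, not anything a particular RAID implementation dictates.

```python
def raid10_usable_gb(drive_gb, n_drives):
    """Usable space when drives are paired into RAID1 mirrors and striped."""
    if n_drives < 2 or n_drives % 2:
        raise ValueError("RAID10 needs an even number of drives (at least 2)")
    return drive_gb * n_drives // 2


def raid10_survives(n_drives, failed):
    """True if no mirror pair has lost both of its members.

    Assumed layout: drives 0 and 1 are the first pair, 2 and 3 the second...
    """
    failed = set(failed)
    return not any({a, a + 1} <= failed for a in range(0, n_drives, 2))


print(raid10_usable_gb(18, 4))      # 36
print(raid10_survives(4, [0, 2]))   # True: one drive lost from each pair
print(raid10_survives(4, [0, 1]))   # False: both halves of one pair lost
```

So half the drives can go, but only if the failures land in different pairs.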

4.2 another possibility is to do raid0, but is that ever wise unless
you desperately need the space since in our situation you run a 1/4
chance of going down completely?

Anything that involves raid0 over several raw drives is going to be
an Achilles heel.  Loss of any one disk out of a raid0 disables the
whole stripe.
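
To put a number on that, here is a small sketch assuming each drive fails independently with the same probability over some period.  The 5% figure is purely illustrative, not a measured failure rate.

```python
def raid0_failure_probability(p_drive, n_drives):
    """Chance of losing the whole stripe: any single drive failing kills it.

    Assumes drive failures are independent and equally likely.
    """
    return 1 - (1 - p_drive) ** n_drives


# Illustrative only: 5% chance per drive, four drives in the stripe.
print(round(raid0_failure_probability(0.05, 4), 4))   # 0.1855
```

The risk grows with every drive you add to the stripe, which is the opposite of what you want from an array.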

4.3 is striping or mirroring faster as far as i/o goes (or does the
difference really matter)? i would have thought the former, but the
handbook says "Striping requires somewhat more effort to locate the
data, and it can cause additional I/O load where a transfer is spread
over multiple disks" #20.3

Mirroring tends to make reads a bit faster (because there are two disks
to spread the IO between) and writes slightly slower (because the write
has to hit both platters).  On the whole, however, the performance
difference between a mirrored pair and a single drive is probably not
noticeable[*].

Striping across drives /generally/ gives you a big performance boost --
it depends really on your traffic patterns.  If you're doing lots of
small parallel IOs randomly distributed across the whole filesystem then
striping is a really good choice.  (Most RDBMSes produce this sort of
pattern, and so may things like web or mail servers.)  If you're
streaming large quantities of data sequentially into or out of a file,
then striping isn't bad, but you might find it worthwhile to consider
more space efficient geometries like RAID5[+], where this traffic
pattern minimises the overhead of the extra processing involved.

The big deal with any sort of RAID is not how long it takes to work out
which disk the data is on or anything like that.  That's an operation
that completes on the time scale of CPU events: i.e. nanoseconds.  The
big stumbling block is always waiting for the disk to rotate, an
operation which occurs on the timescale of milliseconds.  The more
spindles your IO request can be spread over, the more that delay can be
parallelized between them and the faster the ultimate result.
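
To make the two timescales concrete, here is a hedged Python sketch.  The round-robin stripe layout and the function names are assumptions for illustration (real RAID implementations may lay data out differently); the rotational figure is just half a revolution at the given spindle speed.

```python
def stripe_location(block, n_disks, stripe_blocks):
    """Map a logical block to (disk index, block offset on that disk).

    Assumed simple round-robin layout: a couple of integer operations,
    which is why "finding the disk" costs next to nothing.
    """
    stripe, within = divmod(block, stripe_blocks)
    disk = stripe % n_disks
    offset = (stripe // n_disks) * stripe_blocks + within
    return disk, offset


def average_rotational_latency_ms(rpm):
    """Average wait for the sector to come round: half a revolution."""
    return 0.5 * 60_000 / rpm


print(stripe_location(10, 2, 4))                      # (0, 6)
print(round(average_rotational_latency_ms(7200), 2))  # 4.17
print(average_rotational_latency_ms(15000))           # 2.0
```

The mapping is a handful of arithmetic instructions; the wait for the platter is several milliseconds per request, and that is the part extra spindles let you overlap.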

        Cheers,

        Matthew

[*] Unless you adopt a highly sub-optimal configuration like mirroring
the master with the slave on the same IDE bus.
[+] but don't expect any sort of sparkling performance out of RAID5
unless you have a decent hardware controller card with plenty of cache
RAM.

--
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                 Kent, CT11 9PW
