prad wrote:
4.1 with 4 18G drives one thought is to do a raid1, but we really don't want 3 identical copies. is the only way to have 2 36G mirrors, by using raid0+1 or raid1+0?
raid10 strongly preferred -- ie. you make a series of raid1 pairs and then stripe across them. This is high performance and resilient to disk failures -- it can conceivably survive loss of half your drives so long as it's only one from each RAID1 pair.
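
To make the "one from each RAID1 pair" condition concrete, here is a minimal sketch in Python. The drive numbering and the pairing (0,1) and (2,3) are assumptions made up for illustration, not anything from the original setup; the 18G size is the one given in the question.

```python
from itertools import combinations

def raid10_survives(failed, pairs):
    """Return True if no RAID1 pair has lost both of its members."""
    return all(not (a in failed and b in failed) for a, b in pairs)

# Hypothetical layout: four 18G drives, mirrored as (0,1) and (2,3),
# with a stripe laid across the two mirrors.
pairs = [(0, 1), (2, 3)]
drives = [0, 1, 2, 3]

# Each mirrored pair contributes one drive's worth of usable space.
usable_gb = len(pairs) * 18

# Enumerate every possible two-drive failure and keep the survivable ones.
two_disk_failures = list(combinations(drives, 2))
survivable = [f for f in two_disk_failures if raid10_survives(set(f), pairs)]

print("usable space:", usable_gb, "G")
print("survivable two-drive failures:", len(survivable), "of", len(two_disk_failures))
```

Under these assumptions the array survives any single-drive failure, and a two-drive failure only when the two drives belong to different pairs.
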
4.2 another possibility is to do raid0, but is that ever wise unless you desperately need the space since in our situation you run a 1/4 chance of going down completely?
Anything that involves raid0 over several raw drives is going to be an Achilles heel. Loss of any one disk out of a raid0 disables the whole stripe.
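
As a rough illustration of why that matters, the sketch below estimates the chance of losing a stripe when any single disk failure takes it down. It assumes disk failures are independent, and the per-disk failure probability is an invented figure for illustration, not a measured one.

```python
def raid0_failure_probability(p_disk, n_disks):
    """Chance that at least one of n independent disks fails,
    which for a raid0 stripe means the whole stripe is lost."""
    return 1.0 - (1.0 - p_disk) ** n_disks

# Assumed per-disk failure probability over some period (illustrative only).
p = 0.05

for n in (1, 2, 4):
    print(n, "disk(s):", round(raid0_failure_probability(p, n), 4))
```

Whatever per-disk figure you plug in, the risk for the stripe is always higher than for a single drive, and it grows with every disk added.
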
4.3 is striping or mirroring faster as far as i/o goes (or does the difference really matter)? i would have thought the former, but the handbook says "Striping requires somewhat more effort to locate the data, and it can cause additional I/O load where a transfer is spread over multiple disks" #20.3
Mirroring tends to make reads a bit faster (because there are two disks to spread the IO between) and writes slightly slower (because the write has to hit both platters). On the whole, however, the performance difference between a mirrored pair and a single drive is probably not noticeable[*].
Striping across drives /generally/ gives you a big performance boost -- it depends really on your traffic patterns. If you're doing lots of small parallel IOs randomly distributed across the whole filesystem then striping is a really good choice. (Most RDBMSes produce this sort of pattern, and so may things like web or mail servers.) If you're streaming large quantities of data sequentially into or out of a file, then striping isn't bad, but you might find it worthwhile to consider more space efficient geometries like RAID5[+], where this traffic pattern minimises the overhead of the extra processing involved.
The big deal with any sort of RAID is not how long it takes to work out which disk the data is on or anything like that. That's an operation that completes on the time scale of CPU events: ie nanoseconds. The big stumbling block is always waiting for the disk to rotate, an operation which occurs on the timescale of milliseconds. The more spindles your IO request can be spread over the more that delay can be parallelized between them and the faster the ultimate result.
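
The milliseconds-versus-nanoseconds point can be put into numbers with a very idealised model. This is only a sketch: the 7200 rpm spindle speed and 8 ms seek time are assumed figures, and it supposes small random requests spread evenly over the spindles, ignoring controller overhead, caching and queueing.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the platter to come round: half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution

def random_iops_estimate(rpm, seek_ms, spindles):
    """Idealised small random IOs per second across independent spindles."""
    per_io_ms = seek_ms + avg_rotational_latency_ms(rpm)
    return spindles * 1000.0 / per_io_ms

# Assumed drive characteristics, for illustration only.
rpm, seek_ms = 7200, 8.0

print("rotational latency: %.2f ms" % avg_rotational_latency_ms(rpm))
for spindles in (1, 2, 4):
    print(spindles, "spindle(s): about %.0f IO/s" % random_iops_estimate(rpm, seek_ms, spindles))
```

In this model the per-request delay stays in the millisecond range no matter what, so the only way to do more requests per second is to have more spindles waiting in parallel.
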
Cheers,
Matthew
[*] Unless you adopt a highly sub-optimal configuration like mirroring the master with the slave on the same IDE bus.
[+] but don't expect any sort of sparkling performance out of RAID5 unless you have a decent hardware controller card with plenty of cache RAM.
-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey