Re: speed and scaling
Seth Vidal wrote:

Hi folks, I have an odd question. Where I work we will, in the next year, be in a position to have to process about a terabyte or more of data. The data is probably going to be shipped to us on tapes, but then it needs to be read from disks and analyzed. The process is segmentable, so it's reasonable to break it down into 2-4 sections for processing; arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data, ideally 90MB/s+.

So we're considering the following:
Dual Processor P3 something. ~1gb ram.
Multiple 75gb ultra 160 drives - probably ibm's 10krpm drives.

Something to think about regarding IBM 10K drives:
http://www8.zdnet.com/eweek/stories/general/0,11011,2573067,00.html

Adaptec's best 160 controller that is supported by linux.
The data does not have to be redundant or stable - since it can be restored from tape at almost any time - so I'd like to put this in a software raid 0 array for the speed.

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives? What controllers/drives should I be looking at? And has anyone worked with gigabit connections to an array of this size for nfs access? What sort of speeds can I optimally (figuring nfsv3 in async mode from the 2.2 patches or 2.4 kernels) expect to achieve for network access?

thanks
-sv

Lots of discussion on this already, so I will just touch on a few points.

Jon Lewis mentioned that you should place a value on how long it takes to read in your data sets when considering the value of RAID. Unfortunately, RAID write throughput can be relatively slow for HW RAID compared to SW RAID. I've appended some numbers for reference.

Also, the process of reading in from tape will double the load on a single PCI bus. I think you will be happier with a dual PCI bus MB. One of the eternal unknowns is how well a particular Intel (or clone) chipset will work for a particular I/O load. Alpha and Sparc MBs are designed with I/O as a higher priority goal than Intel MBs. Intel is getting better at this, but contention between the PCI busses for memory access can be a problem as well.

Brian Pomerantz at LLNL has gotten more than 90MB/s streaming to a Ciprico RAID system, but he went to a fair amount of work to get there, i.e. 2MB block sizes. You probably want to talk to him, and you should be able to find posts from him in the raid archives.

Hardware: dual PIII 600Mhz/Lancewood/128MB and 1GB
Mylex: 150, 1100, 352
Mylex cache: writethru
Disks: 5 Atlas V

Some caveats:
1. The focus is on sequential performance. YMMV.
2. The write number for SW RAID5 is surprisingly good. It either indicates excellent cache management and reuse of parity blocks, or some understanding of the sequential nature of the bonnie benchmark. A RAID5 update should be approximately 25% of raw write performance with no caching assistance (see the sketch after the tables below).
3. I am a little bothered by the very strong correlation between CPU% and MB/s for all of the Mylex controllers in the bonnie tests. I guess that is the I/O service overhead, but it still seems high to me.
4. The HW RAID numbers are for 5 drives. The SW RAID numbers are for 8 drives.
5. The effect of CPU memory on bonnie (AcceleRAID 150) read performance is 15-20%.
See below:

- AcceleRAID 150, DRAM=256MB

                Read                Write
                BW MB/s    CPU      BW MB/s    CPU
    RAID3       42.3       34%      4.6        3%
    RAID5       43.0       38%      4.5        3%
    RAID6(0+1)  37.5       33%      12.7       11%

- AcceleRAID 150, DRAM=1GB

                Read                Write
                BW MB/s    CPU      BW MB/s    CPU
    RAID3       48.4       50%      4.6        3%
    RAID5       49.1       51%      4.5        3%
    RAID6(0+1)  45.2       39%      12.7       10%

6. The Mylex eXtremeRAID 1100 does not show much difference in write performance between RAID5 and RAID6. See below:

- eXtremeRAID 1100, 1GB

                Read                Write
                BW MB/s    CPU      BW MB/s    CPU
    RAID3       48.3       50%      14.7       13%
    RAID5       52.7       55%      15.1       13%
    RAID6(0+1)  48.1       49%      14.7*      13%
    RAID0       56.3       60%      40.8       37%
    * This should be better

- AcceleRAID 352, 1GB

                Read                Write
                BW MB/s    CPU      BW MB/s    CPU
    RAID3       45.2       44%      6.8        5%
    RAID5       46.2       45%      6.6        5%
    RAID6(0+1)  39.6       39%      16.7*      14%
    RAID0       50.5       50%      36.7       30%
    * This is better

After talking to Dan Jones and the figures he was getting on Mylex cards, I decided to do some simple software raid benchmarks.

Hardware: dual PIII 500Mhz/Nightshade/128MB
SCSI: NCR 53c895 (ultra 2 lvd 80MB/s)
Disks: 8 18G
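As a rough illustration of caveat 2 above (an editor's sketch in Python, not part of the original mail; the raw write figure is an assumed placeholder), the ~25% number follows from the classic read-modify-write cost of an uncached RAID5 update: each small logical write needs a read of the old data block, a read of the old parity block, and writes of the new data and parity, i.e. four disk I/Os for one block of useful data.

    # Editor's sketch: the classic RAID5 small-write (read-modify-write) penalty.
    raw_write_mb_s = 40.0   # assumed raw striped write rate, purely for illustration

    # One logical write costs 4 disk I/Os: read old data, read old parity,
    # write new data, write new parity.
    ios_per_logical_write = 4
    raid5_write_mb_s = raw_write_mb_s / ios_per_logical_write

    print(f"~{raid5_write_mb_s:.0f} MB/s, i.e. about 25% of the raw write rate")

With a write-back cache, or with large sequential writes that cover whole stripes, the controller (or the SW RAID code) can skip the reads and write full stripes, which is one plausible reading of why the SW RAID5 write number beats the 25% estimate.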
Re: speed and scaling
Quoted mail (Dan Hollis wrote; some parts are cut):

I never denied that such beasts exist. I just wanted to point out that an x86 machine with those mobos would come close in price to the alpha solution. I simply can't imagine that there are no alpha boxen with more than 2 PCI busses. If I had a faster internet connection now, I'd check the web site of alpha-processor Inc.

I use a Siemens D111 board/chipset 450NX. It has 3 PCI buses. It is a PIII/Xeon board, not very cheap, but with one 64-bit PCI bus which is done by coupling two 32-bit PCI buses.

cu
Jens

Marc
--
Marc Mutz [EMAIL PROTECTED]  http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics
PGP-keyID's: 0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
---End quote---

Jens Klaas
NEC Europe Ltd.
CC Research Laboratories
Rathausallee 10
D-53757 Sankt Augustin
Phone: 02241/9252-0  02241/9252-72
Fax: 02241/9252-99
eMail: [EMAIL PROTECTED]
www.ccrl-nece.de/klaas
--
In sharks we trust.
--
Re: speed and scaling
Dan Hollis wrote:

On Sat, 15 Jul 2000, Marc Mutz wrote:

Look, you are on the _very_ wrong track! You may have 6 or 7 PCI _slots_, but you have only _one_ bus, i.e. only 133MB/sec bandwidth for _all_ 6 or 7 devices. You will not get 90MB/sec real throughput with a bus bandwidth of 133MB/sec! And the x86 architecture's memory bandwidth is _tiny_ (the BX chipset does one or two _dozen_ MB/sec random access, i.e. 12-24 MB/sec).

No. BX does 180mbyte/sec (measured). K7 with Via KX133 does 262mbyte/sec (measured).

Read more carefully. I said _random_ access, not sequential.

I'd like to get numbers for real alphas. The only alpha I was able to measure was Alphastation 200 4/233. A measly 71mbyte/sec on that piece of shit.

How old is that "shit", and what were the numbers then on x86?

The alphas we have here have the same number of slots. But not only one bus. They typically have 3 slots/bus.

There are multiple pci bus x86 motherboards. Generally found on systems with 6 slots.

I have seen x86 motherboards with 3 PCI buses. I'd like to see how the x86 memory subsystem can saturate three (or only two) 533MB/sec 64/66 PCI busses and still have the bandwidth to compute a 90MB/sec stream of data.

but the most ive seen on alpha or sparc is 2.
-Dan

I never denied that such beasts exist. I just wanted to point out that an x86 machine with those mobos would come close in price to the alpha solution. I simply can't imagine that there are no alpha boxen with more than 2 PCI busses. If I had a faster internet connection now, I'd check the web site of alpha-processor Inc.

Marc
--
Marc Mutz [EMAIL PROTECTED]  http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics
PGP-keyID's: 0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
RE: speed and scaling
Enough with the vulgarities. This doesn't really belong on the RAID list any longer, but I'll make a few points below.

-----Original Message-----
From: Marc Mutz [mailto:[EMAIL PROTECTED]]

The alphas we have here have the same number of slots. But not only one bus. They typically have 3 slots/bus.

There are multiple pci bus x86 motherboards. Generally found on systems with 6 slots.

I have seen x86 motherboards with 3 PCI buses. I'd like to see how the x86 memory subsystem can saturate three (or only two) 533MB/sec 64/66 PCI busses and still have the bandwidth to compute a 90MB/sec stream of data.

A P-III memory subsystem is capable of probably 800MB/sec; I doubt that it can handle more than that. Alphas and SPARCs have more than that, but you pay through the nose for it. It's also worth noting that x86 shares memory bandwidth when you do SMP (800MB/sec between two processors), where the EV6 Alphas have a switched memory bus. I haven't investigated that beyond reading a couple of papers, but you can find more on Compaq's website.

but the most ive seen on alpha or sparc is 2.

I never denied that such beasts exist. I just wanted to point out that an x86 machine with those mobos would come close in price to the alpha solution. I simply can't imagine that there are no alpha boxen with more than 2 PCI busses. If I had a faster internet connection now, I'd check the web site of alpha-processor Inc.

http://www.compaq.com/alphaserver/gs80/index.html
This isn't even the top of the line Alpha, but it has 16 (that's 10 in hex, or 20 in octal) PCI busses. Again, you PAY for that kind of machine. I'm sure that the top of the Alpha line must be close to as expensive as a Sun Ultra Enterprise 1, which goes for about seven figures.

Now that we know that you can get bigger machines out of Alpha/SPARC than you can out of "stock" x86 machines, and that you have to pay for that kind of performance, can we get back to talking about RAID on Linux please?
Grego
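To put the bandwidth numbers in this exchange side by side (an editor's sketch in Python, not from the mail; the "memory touches per byte" count is an assumption made for illustration):

    # Editor's sketch: rough memory-bandwidth budget for a 90 MB/s served stream
    # on a P-III class machine, using the figures quoted in this thread.
    mem_bw = 800.0       # MB/s, P-III memory subsystem estimate from above
    pci_64_66 = 533.0    # MB/s, theoretical 64-bit/66MHz PCI bus
    stream = 90.0        # MB/s, the target from the original question

    # Assume each byte crosses memory a few times (DMA in from the controller,
    # DMA out to the NIC, plus kernel copies); 4 touches is a guess.
    touches = 4
    print(f"stream needs ~{stream * touches:.0f} MB/s of ~{mem_bw:.0f} MB/s memory bandwidth")
    print(f"three 64/66 PCI buses could theoretically demand {3 * pci_64_66:.0f} MB/s")

So a single 90MB/s stream looks feasible on the memory side; fully saturating two or three 64/66 PCI buses at once does not.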
RE: speed and scaling
On Tue, 18 Jul 2000, Gregory Leblanc wrote:

http://www.compaq.com/alphaserver/gs80/index.html
This isn't even the top of the line Alpha, but it has 16 (that's 10 in hex, or 20 in octal) PCI busses. Again, you PAY for that kind of machine.

That machine costs around US$2 million.
-Dan
Re: speed and scaling
Seth Vidal wrote:

I'd try an alpha machine, with a 66MHz/64-bit PCI bus and interleaved memory access, to improve memory bandwidth. It costs around $1 with 512MB of RAM, see SWT (or STW) or Microway. This cost is small compared to the disks.

The alpha comes with other headaches I'd rather not involve myself with - in addition, the cost of the disks is trivial - 7 75gig scsi's @ $1k each is only $7k - and the machine housing the disks also needs to be one which will do some of the processing - and all of their code is x86 - so I'm hesitant to suggest alphas for this.

Look at the reality! If you have to do this sort of thing, x86 will give you headaches. Normal 32-bit/33MHz PCI is _way_ too slow for that machine. The entire PCI bus cannot saturate _one_ Ultra160 SCSI controller, let alone a GigEth card. Putting more than one into a box and trying to use them concurrently will show you what good normal PCI is for _really_ fast hardware. You sure want multiple 64-bit/66MHz PCI busses, and _now_ look again at board prices for x86. Also, if you do data analysis like setiathome does (i.e. mostly FP), alphas blow away _any_ other microprocessor (setiathome work-unit in less than an hour; my AMD K6-2 500 needs 18hrs!). Code can be re-compiled, and probably should be.

Another advantage of the alpha is that you have more PCI slots. I'd put 3 disks on each card, and use about 4 of them per machine. This should be enough to get you 500GB.

More how - the current boards I'm working with have 6-7 pci slots - no ISA's at all.

Look, you are on the _very_ wrong track! You may have 6 or 7 PCI _slots_, but you have only _one_ bus, i.e. only 133MB/sec bandwidth for _all_ 6 or 7 devices. You will not get 90MB/sec real throughput with a bus bandwidth of 133MB/sec! And the x86 architecture's memory bandwidth is _tiny_ (the BX chipset does one or two _dozen_ MB/sec random access, i.e. 12-24 MB/sec).

The alphas we have here have the same number of slots.

But not only one bus. They typically have 3 slots/bus.

Might I also suggest a good UPS system? :-) Ah, and a journaling FS...

the ups is a must - the journaling filesystem is at issue too - in an ideal world there will be a Journaling File system that works correctly with sw raid :)

-sv
--
Marc Mutz [EMAIL PROTECTED]  http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics
PGP-keyID's: 0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
Re: speed and scaling
Boy, I don't know who was screaming about PCI bandwidth, but:
1) I was the first person to mention it, weeks ago, so it's no news flash, and you're no genius for thinking of it.
2) Stop screaming.
3) Be informed.

P.S. Let us know what happens, Seth, this sounds like a cool project!
Ed
Re: speed and scaling
Seth Vidal writes:

I have an odd question. Where I work we will, in the next year, be in a position to have to process about a terabyte or more of data. The data is probably going to be shipped to us on tapes, but then it needs to be read from disks and analyzed. The process is segmentable, so it's reasonable to break it down into 2-4 sections for processing; arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data, ideally 90MB/s+.

So we're considering the following:
Dual Processor P3 something. ~1gb ram.
Multiple 75gb ultra 160 drives - probably ibm's 10krpm drives.
Adaptec's best 160 controller that is supported by linux.
The data does not have to be redundant or stable - since it can be restored from tape at almost any time - so I'd like to put this in a software raid 0 array for the speed.

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives? What controllers/drives should I be looking at?

Here are actual benchmarks from one of my systems.

dbench:
 2  Throughput 123.637 MB/sec (NB=154.546 MB/sec 1236.37 MBit/sec)
 4  Throughput 109.7 MB/sec (NB=137.126 MB/sec 1097 MBit/sec)
32  Throughput 77.7743 MB/sec (NB=97.2178 MB/sec 777.743 MBit/sec)
64  Throughput 64.3793 MB/sec (NB=80.4741 MB/sec 643.793 MBit/sec)

Bonnie:
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         2000  9585 99.1 51312 26.0 28675 45.3  9224 94.8 81720 73.3 512.2  4.2

That is with a Dell 4400, 2 x 600 MHz Pentium III Coppermine CPUs with 256K cache, 1GB RAM, one 64-bit 66MHz PCI bus (and one 33MHz PCI bus). The disk subsystem is a built-in Adaptec 7899 dual-channel Ultra160 controller with 8 Quantum ATLAS IV 9GB SCA disks attached to one channel. The benchmarks above were done on an ext2 filesystem with 4KB blocksize and stride 16, created on a 7-way stripe of the above disks using software RAID (0.9 on kernel 2.2.x) with 64KB chunksize.

If you use both SCSI channels and use 36GB disks (let alone 75GB ones), you'll get 480GB of disk without even needing to plug another SCSI card in. With larger disks or another SCSI card or two you could go larger/faster.

--Malcolm
--
Malcolm Beattie [EMAIL PROTECTED]
Unix Systems Programmer
Oxford University Computing Services
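For readers wondering where the "stride 16" above comes from (an editor's note, not part of Malcolm's mail), it is simply the RAID chunk size expressed in filesystem blocks:

    # Editor's sketch: ext2 stride for a striped array = chunk size / block size.
    chunk_size_kb = 64   # software RAID chunk size used in the post
    block_size_kb = 4    # ext2 block size used in the post

    stride = chunk_size_kb // block_size_kb
    print(stride)        # 16, matching the "stride 16" filesystem described above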
Re: speed and scaling
Hi!

or just brilliant driver design by Leonard Zubkof, but the Mylex cards are the performance king for hardware RAID under Linux (and

I was surprised to hear that. We just bought a Mylex AcceleRAID 250 with five 18G IBM disks, and it's a lot slower than sw RAID:

HW RAID 5: read 26MB/s, write 9MB/s
SW RAID 5: read 43MB/s, write 39MB/s
SW RAID 0: read 62MB/s, write 65MB/s

Tamas
Re: speed and scaling
From [EMAIL PROTECTED] Tue Jul 11 05:21:43 2000

Hi!

or just brilliant driver design by Leonard Zubkof, but the Mylex cards are the performance king for hardware RAID under Linux (and

I was surprised to hear that. We just bought a Mylex AcceleRAID 250 with five 18G IBM disks, and it's a lot slower than sw RAID:

HW RAID 5: read 26MB/s, write 9MB/s
SW RAID 5: read 43MB/s, write 39MB/s
SW RAID 0: read 62MB/s, write 65MB/s

The 250 is their lowest end card. The ExtremeRAID cards have much faster processors and more memory. However, software RAID is almost always faster. I believe I mentioned that in my original note.

Cheers,
Chris
--
Christopher Mauritz
[EMAIL PROTECTED]
Re: speed and scaling
On Tue, 11 Jul 2000, Krisztián Tamás wrote:

or just brilliant driver design by Leonard Zubkof, but the Mylex cards are the performance king for hardware RAID under Linux (and

I was surprised to hear that. We just bought a Mylex AcceleRAID 250 with five 18G IBM disks, and it's a lot slower than sw RAID:

HW RAID 5: read 26MB/s, write 9MB/s
SW RAID 5: read 43MB/s, write 39MB/s

That's about the same as I get with an AcceleRAID 150 with 16mb and 5 IBM 9GB 7200rpm drives doing RAID5. The RAID controllers' CPUs just don't seem to keep up with the system processor speedwise. Of course, these are Mylex's low-end entry level RAID cards. I want to try out an ExtremeRAID and see if write speed gets much better.

--
Jon Lewis *[EMAIL PROTECTED]*| I route
System Administrator | therefore you are
Atlantic Net |
_ http://www.lewis.org/~jlewis/pgp for PGP public key_
Re: speed and scaling
Seth Vidal wrote:

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives? What controllers/drives should I be looking at?

I'm a big IDE fan, and have experimented with raid0 a fair bit; I dreamt of achieving these speeds with 5 udma66 drives on multiple controllers. My experience is that IDE performance is seriously restricted in linux and definitely doesn't scale linearly. I have had good experiences with promise controllers, but will never again try to use multiple HPT cards.

Under 2.2.x an IDE raid0 can get around 50 MB/s max; it effectively doesn't scale past 3 drives. Under 2.4.x-test you can get a bit more than the speed of a single drive - it's just f*cked. It has been a known problem for many months, but it doesn't look like anything much is going to happen to fix it. I suspect the performance problems may be tied up with the whole latency thing, or the elevator code, but that's really just me guessing. As far as I can tell, IDE raid performance is limited by the IDE code, not the raid code.

I guess the good news is that SCSI performance is said to scale linearly, and it sounds like in your situation you would have a few $ available to spend on SCSI hardware. I'm sure a few people here would be interested in seeing some benchmarks when you do get something happening. Good luck.

Glenn
Re: speed and scaling
arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data. Ideally 90MB/s+

Is this vastly read-only or will write speed also be a factor?

-HJC
Re: speed and scaling
Seth Vidal wrote:
[monster data set description snipped]

So we're considering the following:
Dual Processor P3 something. ~1gb ram.
Multiple 75gb ultra 160 drives - probably ibm's 10krpm drives.
Adaptec's best 160 controller that is supported by linux.
[snip]
So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives?

While you might get this from your controller data bus, I'm skeptical of moving this much data consistently across the PCI bus. I think it has a maximum of 133 MB/sec bandwidth (33 MHz * 32 bits width). Especially if (below) you have some network access going on at near gigabit speeds - you're just pushing lots of data.

What controllers/drives should I be looking at?

See if there is some sort of system you can get with multiple PCI busses, bridged or whatnot.

And has anyone worked with gigabit connections to an array of this size for nfs access? What sort of speeds can I optimally (figuring nfsv3 in async mode from the 2.2 patches or 2.4 kernels) expect to achieve for network access?

I've found vanilla nfs performance to be crummy, but haven't played with it at all.

Ed
--
Edward Schernau, mailto:[EMAIL PROTECTED]
Network Architect    http://www.schernau.com
RC5-64#: 243249      e-gold acct #: 131897
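As a quick check of the 133 MB/sec figure above (an editor's sketch, not part of Ed's mail), the peak PCI numbers quoted throughout this thread all come from multiplying clock rate by bus width:

    # Editor's sketch: theoretical peak PCI bandwidth = clock (MHz) x width (bytes).
    def pci_peak_mb_s(clock_mhz: float, width_bits: int) -> float:
        return clock_mhz * (width_bits / 8)

    print(pci_peak_mb_s(33.33, 32))   # ~133 MB/s, standard 32-bit/33MHz PCI
    print(pci_peak_mb_s(66.66, 64))   # ~533 MB/s, the 64-bit/66MHz bus mentioned elsewhere

And that is a shared, theoretical peak; sustained throughput with several busy devices on the same bus will be noticeably lower.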
RE: speed and scaling
-----Original Message-----
From: Seth Vidal [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 10, 2000 12:23 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: speed and scaling

So we're considering the following:
Dual Processor P3 something. ~1gb ram.
Multiple 75gb ultra 160 drives - probably ibm's 10krpm drives.
Adaptec's best 160 controller that is supported by linux.
The data does not have to be redundant or stable - since it can be restored from tape at almost any time - so I'd like to put this in a software raid 0 array for the speed.

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives?

Assuming sequential reads, you should be able to get this from good drives.

What controllers/drives should I be looking at?

I'm not familiar with current top-end drives, but you should be looking for at least 4MB of cache on the drives. I think the best drives that you'll find will be able to deliver 20MB/sec without trouble, possibly a bit more. I seem to remember somebody on this list liking Adaptec cards, but nobody on the SPARC lists will touch the things. I might look at a Tekram or a Symbios-based card; I've heard good things about them, and they're used on some of the bigger machines that I've worked with.

Later,
Grego
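Taking Greg's ~20MB/sec-per-drive estimate at face value (an editor's sketch, not from the mail), the 5-8 drive range in the original question is about what the arithmetic suggests for a 90MB/s RAID0 stripe:

    import math

    # Editor's sketch: drives needed for the target rate at the per-drive
    # sequential figure quoted above.
    target_mb_s = 90.0
    per_drive_mb_s = 20.0

    print(math.ceil(target_mb_s / per_drive_mb_s))   # 5 drives at the optimistic end

If the drives deliver less than 20MB/s sustained, or the bus becomes the bottleneck, the upper end of the 5-8 range (or more) is the safer bet.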
Re: speed and scaling
If you can afford it and this is for real work, you may want to consider something like a Network Appliance Filer. It will be a lot more robust and quite a bit faster than rolling your own array. The downside is they are quite expensive. I believe the folks at Raidzone make a "poor man's" canned array that can stuff almost a terabyte in one box and uses cheaper IDE disks.

If you can't afford either of these solutions, 73gig Seagate Cheetahs are becoming affordable. Packing one of those rackmount 8 bay enclosures with these gets you over 500gb of storage if you just want to stripe them together. That would likely be VERY fast for reads/writes. The risk is that you'd lose everything if one of the disks crashed.

Cheers,
Chris

From [EMAIL PROTECTED] Mon Jul 10 16:46:37 2000

Sounds like fun. Check out VA Linux's dual CPU boxes. They also offer fast LVD SCSI drives which can be raided together. I've got one dual P3-700 w/ dual 10k LVD drives. FAST!

I'd suggest staying away from NFS for performance reasons. I think there is a better replacement out there ('coda' or something?). NFS will work, but I don't think it's what you want. You could also try connecting the machines through SCSI if you want to share files quickly (I haven't done this, but have heard of it). Good luck!

Phil

On Mon, Jul 10, 2000 at 03:22:46PM -0400, Seth Vidal wrote:

Hi folks, I have an odd question. Where I work we will, in the next year, be in a position to have to process about a terabyte or more of data. The data is probably going to be shipped on tapes to us but then it needs to be read from disks and analyzed. The process is segmentable so its reasonable to be able to break it down into 2-4 sections for processing so arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data. Ideally 90MB/s+ [...]

--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
[EMAIL PROTECTED] -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
--
Christopher Mauritz
[EMAIL PROTECTED]
Re: speed and scaling
arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data. Ideally 90MB/s+

Is this vastly read-only or will write speed also be a factor?

mostly read-only.
-sv
RE: speed and scaling
i have not used adaptec 160 cards, but i have found most everything else they make to be very finicky about cabling and termination, and have had hard drives give trouble on adaptec that worked fine on other cards. my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and symbios themselves have a nice 64 bit wide, dual channel pci scsi card.

which does lead to the point about pci. even _IF_ you could get the entire pci bus to do your disk transfers, you will find that you would still need more bandwidth for stuff like using your nics. so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit pci, or both. perhaps alpha?

allan

Gregory Leblanc [EMAIL PROTECTED] said:

-----Original Message-----
From: Seth Vidal [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 10, 2000 12:23 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: speed and scaling

So were considering the following: Dual Processor P3 something. ~1gb ram. multiple 75gb ultra 160 drives - probably ibm's 10krpm drives. Adaptec's best 160 controller that is supported by linux. The data does not have to be redundant or stable - since it can be restored from tape at almost any time - so I'd like to put this in a software raid 0 array for the speed.

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives?

Assuming sequential reads, you should be able to get this from good drives.

What controllers/drives should I be looking at?

I'm not familiar with current top-end drives, but you should be looking for at least 4MB of cache on the drives. I think the best drives that you'll find will be able to deliver 20MB/sec without trouble, possibly a bit more. I seem to remember somebody on this liking Adaptec cards, but nobody on the SPARC lists will touch the things. I might look at a Tekram, or a Symbios based card, I've heard good things about them, and they're used on some of the bigger machines that I've worked with.

Later,
Grego
Re: speed and scaling
If you can afford it and this is for real work, you may want to consider something like a Network Appliance Filer. It will be a lot more robust and quite a bit faster than rolling your own array. The downside is they are quite expensive. I believe the folks at Raidzone make a "poor man's" canned array that can stuff almost a terabyte in one box and uses cheaper IDE disks.

I priced the netapps - they are ridiculously expensive. They estimated 1tb at about $60-100K - that's the size of our budget and we have other things to get. What I was thinking was a good machine with a 64bit pci bus and/or multiple buses, and A LOT of external enclosures.

If you can't afford either of these solutions, 73gig Seagate Cheetahs are becoming affordable. Packing one of those rackmount 8 bay enclosures with these gets you over 500gb of storage if you just want to stripe them together. That would likely be VERY fast for reads/writes. The risk is that you'd lose everything if one of the disks crashed.

this isn't much of a concern. The plan so far was this (and this plan is dependent on what advice I get from here):
Raid0 for the read-only data (as it's all on tape anyway).
Raid5 or Raid1 for the writable data on a second scsi controller.

Does this sound reasonable? I've had some uncomfortable experiences with hw raid controllers - ie: VERY poor performance and exorbitant prices. My SW raid experiences under linux have been very good - excellent performance and easy setup and maintenance. (well, virtually no maintenance :)

-sv
Re: speed and scaling
You will definitely need that 64 bit PCI bus. You might want to watch out for your memory bandwidth as well (i.e. get something with interleaved memory). Standard PC doesn't get but 800MB/s peak to main memory. FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a gigabit ethernet card, at least under 2.2. I don't have any experience w/ 2.4 yet.

On Mon, 10 Jul 2000, Seth Vidal wrote:

If you can afford it and this is for real work, you may want to consider something like a Network Appliance Filer. It will be a lot more robust and quite a bit faster than rolling your own array. The downside is they are quite expensive. I believe the folks at Raidzone make a "poor man's" canned array that can stuff almost a terabyte in one box and uses cheaper IDE disks.

I priced the netapps - they are ridiculously expensive. They estimated 1tb at about $60-100K - thats the size of our budget and we have other things to get. What I was thinking was a good machine with a 64bit pci bus and/or multiple buses. And A LOT of external enclosures.

If you can't afford either of these solutions, 73gig Seagate Cheetahs are becoming affordable. Packing one of those rackmount 8 bay enclosures with these gets you over 500gb of storage if you just want to stripe them together. That would likely be VERY fast for reads/writes. The risk is that you'd lose everything if one of the disks crashed.

this isn't much of a concern. The plan so far was this (and this plan is dependent on what advice I get from here):
Raid0 for the read-only data (as its all on tape anyway)
Raid5 or Raid1 for the writable data on a second scsi controller.

Does this sound reasonable? I've had some uncomfortable experiences with hw raid controllers - ie: VERY poor performance and exbortitant prices. My SW raid experiences under linux have been very good - excellent performance and easy setup and maintenance. (well virtually no maintenance :)

-sv

---
Keith Underwood
Parallel Architecture Research Lab (PARL)
[EMAIL PROTECTED]
Clemson University
Re: speed and scaling
I haven't had very good experiences with the Adaptec cards either. If you can take the performance hit, the Mylex ExtremeRAID cards come in a 3-channel variety. You could then split your array into 3 chunks of 3-4 disks each and use hardware RAID instead of the software raidtools.

Cheers,
Chris

From [EMAIL PROTECTED] Mon Jul 10 17:10:27 2000

i have not used adaptec 160 cards, but i have found most everything else they make to be very finicky about cabling and termination, and have had hard drives give trouble on adaptec that worked fine on other cards. my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and symbios themselves have a nice 64 bit wide, dual channel pci scsi card.

which does lead to the point about pci. even _IF_ you could get the entire pci bus to do your disk transfers, you will find that you would still need more bandwidth for stuff like using your nics. so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit pci, or both. perhaps alpha?

allan

Gregory Leblanc [EMAIL PROTECTED] said:

-----Original Message-----
From: Seth Vidal [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 10, 2000 12:23 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: speed and scaling

So were considering the following: Dual Processor P3 something. ~1gb ram. multiple 75gb ultra 160 drives - probably ibm's 10krpm drives. Adaptec's best 160 controller that is supported by linux. The data does not have to be redundant or stable - since it can be restored from tape at almost any time - so I'd like to put this in a software raid 0 array for the speed.

So my questions are these: Is 90MB/s a reasonable speed to be able to achieve in a raid0 array across say 5-8 drives?

Assuming sequential reads, you should be able to get this from good drives.

What controllers/drives should I be looking at?

I'm not familiar with current top-end drives, but you should be looking for at least 4MB of cache on the drives. I think the best drives that you'll find will be able to deliver 20MB/sec without trouble, possibly a bit more. I seem to remember somebody on this liking Adaptec cards, but nobody on the SPARC lists will touch the things. I might look at a Tekram, or a Symbios based card, I've heard good things about them, and they're used on some of the bigger machines that I've worked with.

Later,
Grego
--
Christopher Mauritz
[EMAIL PROTECTED]
RE: speed and scaling
i have not used adaptec 160 cards, but i have found most everything else they make to be very finicky about cabling and termination, and have had hard drives give trouble on adaptec that worked fine on other cards. my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and symbios themselves have a nice 64 bit wide, dual channel pci scsi card.

Can you tell me the model number on that card?

which does lead to the point about pci. even _IF_ you could get the entire pci bus to do your disk transfers, you will find that you would still need more bandwidth for stuff like using your nics.

Right.

so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit pci, or both. perhaps alpha?

The money I would spend on an alpha precludes that option, but some of Dell's server systems support 64bit buses.

thanks
-sv
Re: speed and scaling
FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a gigabit ethernet card, at least under 2.2. I don't have any experience w/ 2.4 yet.

I hadn't planned on implementing this under 2.2 - I realize the constraints on the network performance. I've heard good things about 2.4's ability to scale to those levels though.

thanks for the advice.
-sv
Re: speed and scaling
On Mon, Jul 10, 2000 at 05:40:54PM -0400, Seth Vidal wrote:

FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a gigabit ethernet card, at least under 2.2. I don't have any experience w/ 2.4 yet.

I hadn't planned on implementing this under 2.2 - I realize the constraints on the network performance. I've heard good things about 2.4's ability to scale to those levels though.

2.4.x technically doesn't exist yet. There are some (pre) test versions by Linus and Alan Cox out awaiting feedback from testers, but nothing solid or consistent yet. Be careful when using these for serious work. Newer != Better.

Phil
--
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
[EMAIL PROTECTED] -- http://www.netroedge.com/~phil
PGP F16: 01 D2 FD 01 B5 46 F4 F0 3A 8B 9D 7E 14 7F FB 7A
RE: speed and scaling
I'd try an alpha machine, with a 66MHz/64-bit PCI bus and interleaved memory access, to improve memory bandwidth. It costs around $1 with 512MB of RAM, see SWT (or STW) or Microway. This cost is small compared to the disks.

I've never had trouble with adaptec cards, if you terminate things according to specs, and preferably use terminators on the cables, not the card one. In fact I once had a termination problem, because I was using the card for it... It hosed my raid5 array, because there were two disks on that card...

Another advantage of the alpha is that you have more PCI slots. I'd put 3 disks on each card, and use about 4 of them per machine. This should be enough to get you 500GB. If there is lots of network traffic, this will likely be your bottleneck, particularly because of latency.

Might I also suggest a good UPS system? :-) Ah, and a journaling FS...
Re: speed and scaling
There are some (pre) test versions by Linus and Alan Cox out awaiting feedback from testers, but nothing solid or consistent yet. Be careful when using these for serious work. Newer != Better.

This isn't being planned for the next few weeks - it's 2-6 month planning that I'm doing. So I'm estimating that 2.4 should be out w/i 6 months. I think that's a reasonable guess.

-sv
RE: speed and scaling
I'd try an alpha machine, with a 66MHz/64-bit PCI bus and interleaved memory access, to improve memory bandwidth. It costs around $1 with 512MB of RAM, see SWT (or STW) or Microway. This cost is small compared to the disks.

The alpha comes with other headaches I'd rather not involve myself with - in addition, the cost of the disks is trivial - 7 75gig scsi's @ $1k each is only $7k - and the machine housing the disks also needs to be one which will do some of the processing - and all of their code is x86 - so I'm hesitant to suggest alphas for this.

Another advantage of the alpha is that you have more PCI slots. I'd put 3 disks on each card, and use about 4 of them per machine. This should be enough to get you 500GB.

More how - the current boards I'm working with have 6-7 pci slots - no ISA's at all. The alphas we have here have the same number of slots.

Might I also suggest a good UPS system? :-) Ah, and a journaling FS...

the ups is a must - the journaling filesystem is at issue too - in an ideal world there will be a Journaling File system that works correctly with sw raid :)

-sv
Re: speed and scaling
From [EMAIL PROTECTED] Mon Jul 10 17:53:34 2000

If you can take the performance hit, the Mylex ExtremeRAID cards come in a 3-channel variety. You could then split your array into 3 chunks of 3-4 disks each and use hardware RAID instead of the software raidtools.

I've not had good performance out of mylex. In fact it's been downright shoddy. I'm hesitant to purchase from them again.

Unfortunately, they are the Ferrari of the hardware RAID cards. Compare an ExtremeRAID card against anything from DPT or ICP-Vortex. There is no comparison. I'm not sure if it's poor hardware design or just brilliant driver design by Leonard Zubkof, but the Mylex cards are the performance king for hardware RAID under Linux (and Windows NT/2K for that matter).

Cheers,
Chris
--
Christopher Mauritz
[EMAIL PROTECTED]
Re: speed and scaling
From [EMAIL PROTECTED] Mon Jul 10 18:43:11 2000

There are some (pre) test versions by Linus and Alan Cox out awaiting feedback from testers, but nothing solid or consistent yet. Be careful when using these for serious work. Newer != Better.

This isn't being planned for the next few weeks - it's 2-6 month planning that I'm doing. So I'm estimating that 2.4 should be out w/i 6 months. I think that's a reasonable guess.

That's a really bad assumption. 2.4 has been a "real soon now" item since January, and it still is hanging in the vapors. If you're doing this for "production work," I'd plan on a 2.2 kernel or some known "safe" 2.3 kernel.

C
--
Christopher Mauritz
[EMAIL PROTECTED]
Re: speed and scaling
On Mon, 10 Jul 2000, Seth Vidal wrote:

What I was thinking was a good machine with a 64bit pci bus and/or multiple buses. And A LOT of external enclosures.

Multiple Mylex extremeRAID's.

I've had some uncomfortable experiences with hw raid controllers - ie: VERY poor performance and exorbitant prices.

You're thinking of DPT :) The Mylex stuff (at least the low end AcceleRAID's) are cheap and not too slow.

--
Jon Lewis *[EMAIL PROTECTED]*| I route
System Administrator | therefore you are
Atlantic Net |
_ http://www.lewis.org/~jlewis/pgp for PGP public key_
Re: speed and scaling
On Mon, 10 Jul 2000, Seth Vidal wrote:

arguably only 500gb per machine will be needed. I'd like to get the fastest possible access rates from a single machine to the data. Ideally 90MB/s+

Is this vastly read-only or will write speed also be a factor?

mostly read-only.

If it were me, I'd do big RAID5 arrays. Sure, you have the data on tape, but do you want to sit around while hundreds of GB are restored from tape? RAID5 should give you the read speed of RAID0, and if you're not writing much, the write penalty shouldn't be so bad. If it were totally read-only, you could mount ro, and save yourself considerable fsck time if there's an improper shutdown.

--
Jon Lewis *[EMAIL PROTECTED]*| I route
System Administrator | therefore you are
Atlantic Net |
_ http://www.lewis.org/~jlewis/pgp for PGP public key_
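To put Jon's "do you want to sit around while hundreds of GB are restored" in rough numbers (an editor's sketch; the tape streaming rate is an assumption, not a figure from the thread):

    # Editor's sketch: time to restore one machine's ~500GB data set from tape.
    data_gb = 500          # per-machine data set size from the original question
    tape_mb_s = 15.0       # assumed sustained tape streaming rate (hypothetical)

    hours = (data_gb * 1024) / tape_mb_s / 3600
    print(f"~{hours:.1f} hours to restore {data_gb}GB at {tape_mb_s} MB/s")

At that assumed rate a full restore eats most of a working day, which is the trade-off behind preferring RAID5 over RAID0 even for restorable data.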