Re: speed and scaling

2000-07-21 Thread Dan Jones

Seth Vidal wrote:
 
 Hi folks,
  I have an odd question. Where I work we will, in the next year, be in a
 position to have to process about a terabyte or more of data. The data is
 probably going to be shipped on tapes to us but then it needs to be read
 from disks and analyzed. The process is segmentable, so it's reasonable to
 be able to break it down into 2-4 sections for processing; arguably only
 500gb per machine will be needed. I'd like to get the fastest possible
 access rates from a single machine to the data. Ideally 90MB/s+
 
 So we're considering the following:
 
 Dual Processor P3 something.
 ~1gb ram.
 multiple 75gb ultra 160 drives - probably ibm's 10krpm drives

Something to think about regarding IBM 10K drives:

http://www8.zdnet.com/eweek/stories/general/0,11011,2573067,00.html

 Adaptec's best 160 controller that is supported by linux.
 
 The data does not have to be redundant or stable - since it can be
 restored from tape at almost any time.
 
 so I'd like to put this in a software raid 0 array for the speed.
 
 So my questions are these:
  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
 across say 5-8 drives?
 What controllers/drives should I be looking at?
 
 And has anyone worked with gigabit connections to an array of this size
 for nfs access? What sort of speeds can I optimally (figuring nfsv3 in
 async mode from the 2.2 patches or 2.4 kernels) expect to achieve for
 network access.
 
 thanks
 -sv

Lots of discussion on this already, so I will just touch on a few
points.

Jon Lewis mentioned that you should place a value on how long it
takes to read in your data sets when considering the value of RAID.
Unfortunately, RAID write throughput can be relatively slow for
HW RAID compared to SW RAID. I've appended some numbers for 
reference.

Also, the process of reading in from tape will double the load on
a single PCI bus. I think you will be happier with a dual-PCI-bus
motherboard.

One of the eternal unknowns is how well a particular Intel (or
clone) chipset will work for a particular I/O load. Alpha and Sparc
motherboards are designed with I/O as a higher-priority goal than Intel
boards. Intel is getting better at this, but contention between the PCI
busses for memory access can be a problem as well.

Brian Pomerantz at LLNL has gotten more than 90MB/s streaming to a
Ciprico RAID system, but he went to a fair amount of work to get
there, e.g. 2MB block sizes. You probably want to talk to him; you
should be able to find posts from him in the raid archives.



Hardware: dual PIII 600Mhz/Lancewood/128MB and 1GB
Mylex: 150, 1100,  352
Mylex cache: writethru
Disks: 5 Atlas V

Some caveats:

1. The focus is on sequential performance. YMMV

2. The write number for SW RAID5 is surprisingly good. It either
indicates excellent cache management and reuse of parity blocks or
some understanding of the sequential nature of the bonnie benchmark.
A RAID5 update should be approximately 25% of raw write performance
with no caching assistance.
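
(To put rough numbers on that ~25% figure: a cache-less RAID5 small write
is a read-modify-write cycle - read old data, read old parity, write new
data, write new parity - i.e. four disk operations per logical write. The
40MB/s raw rate below is just an assumed example, not a measured value.)

    # four I/Os per logical write => roughly 1/4 of the raw write rate
    echo $(( 40 / 4 ))    # e.g. 40MB/s raw write -> ~10MB/s RAID5 update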

3. I am a little bothered by the very strong correlation between CPU%
and MB/s for all of the Mylex controllers in the bonnie tests. I
guess that is the I/O service overhead, but it still seems high to me.

4. The HW RAID numbers are for 5 drives. The SW RAID numbers are for
8 drives.

5. The effect of host memory size on bonnie (AcceleRAID 150) read
performance is 15-20%. See below:


AcceleRAID 150

DRAM=256MB
                  Read             Write
             BW MB/s   CPU    BW MB/s   CPU

RAID3          42.3    34%       4.6     3%
RAID5          43.0    38%       4.5     3%
RAID6(0+1)     37.5    33%      12.7    11%

DRAM=1GB
                  Read             Write
             BW MB/s   CPU    BW MB/s   CPU

RAID3          48.4    50%       4.6     3%
RAID5          49.1    51%       4.5     3%
RAID6(0+1)     45.2    39%      12.7    10%

6. The Mylex eXtremeRAID 1100 does not show much difference in
   write performance between RAID5 and RAID6. See below:

-

ExtremeRAID 1100, 1GB
                  Read             Write
             BW MB/s   CPU    BW MB/s   CPU

RAID3          48.3    50%      14.7    13%
RAID5          52.7    55%      15.1    13%
RAID6(0+1)     48.1    49%      14.7*   13%
RAID0          56.3    60%      40.8    37%

* This should be better
-

AcceleRAID 352, 1GB
                  Read             Write
             BW MB/s   CPU    BW MB/s   CPU

RAID3          45.2    44%       6.8     5%
RAID5          46.2    45%       6.6     5%
RAID6(0+1)     39.6    39%      16.7*   14%
RAID0          50.5    50%      36.7    30%

* This is better
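
(For anyone who wants to reproduce numbers like these: a plausible bonnie
invocation is sketched below. The mount point and the 2GB file size are
assumptions, not the exact command used for the figures above.)

    # run bonnie with a test file larger than RAM, on the array's mount point
    bonnie -d /mnt/raid -s 2000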


 
 After talking to Dan Jones and the figures he was getting on Mylex cards, I
 decided to do some simple software raid benchmarks.
 
 Hardware: dual PIII 500Mhz/Nightshade/128MB
 SCSI: NCR 53c895 (ultra 2 lvd 80MB/s)
 Disks: 8 18G 

Re: speed and scaling

2000-07-19 Thread Jens Klaas

Quoted mail:
 Dan Hollis wrote:

... some parts are cut ...

 
 I never denied that such beasts exist. I just wanted to point out that a
 x86 machine with those mobos would come close in price to the alpha
 solution.
 I simply can't imagine that there are no alpha boxen with more than 2
 PCI busses. If I had a faster internet connection now, I'd check the web
 site of alpha-processor Inc.

I use a Siemens D111 board with the 450NX chipset. It has 3 PCI buses.
It is a PIII/Xeon board, not very cheap, but it has one 64-bit PCI bus
which is made by coupling two 32-bit PCI buses.


cu
Jens
 
 
 Marc
 
 -- 
 Marc Mutz [EMAIL PROTECTED]http://marc.mutz.com/Encryption-HOWTO/
 University of Bielefeld, Dep. of Mathematics / Dep. of Physics
 
 PGP-keyID's:   0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)
 
 

---End quote

--
Jens Klaas
NEC Europe Ltd.
CC Research Laboratories
Rathausallee 10
D-53757 Sankt Augustin

Phone: 
02241/9252-0  
02241/9252-72 
 
Fax:  
02241/9252-99  

eMail:
[EMAIL PROTECTED]
www.ccrl-nece.de/klaas
--
In sharks we trust. 
--





Re: speed and scaling

2000-07-18 Thread Marc Mutz

Dan Hollis wrote:
 
 On Sat, 15 Jul 2000, Marc Mutz wrote:
  Look, you are on the _very_ wrong track! You may have 6 or 7 PCI
  _slots_, but you have only _one_ bus, i.e. only 133MB/sec bandwidth for
  _all_ 6 or 7 devices. You will not get 90MB/sec real throughput with a
  bus bandwidth of 133MB/sec! And the x86 architecture's memory bandwidth
  is _tiny_ (BX chipset does one or two _dozen_ MB/sec random access, i.e.
  12-24 MB/sec).
 
 No. BX does 180mbyte/sec (measured).
 K7 with Via KX133 does 262mbyte/sec (measured).
 

Read more carefully. I said _random_ access, not sequential.

 I'd like to get numbers for real alphas. The only alpha I was able to
 measure was Alphastation 200 4/233. A measly 71mbyte/sec on that piece of
 shit.
 

How old is that "shit" and what were the numbers then on x86?

   The alphas we have here have the same number of slots.
  But not only one bus. They typically have 3 slots/bus.
 
 There are multiple pci bus x86 motherboards. Generally found on systems
 with 6 slots. I have seen x86 motherboards with 3 PCI buses, interrupted

I'd like to see how the x86 memory subsystem can saturate three (or only
two) 533MB/sec 64/66 PCI busses and still have the bandwidth to compute
a 90MB/sec stream of data.

 but the most
 ive seen on alpha or sparc is 2.
 
 -Dan

I never denied that such beasts exist. I just wanted to point out that an
x86 machine with those mobos would come close in price to the alpha
solution.
I simply can't imagine that there are no alpha boxen with more than 2
PCI busses. If I had a faster internet connection now, I'd check the web
site of alpha-processor Inc.

Marc

-- 
Marc Mutz [EMAIL PROTECTED]http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics

PGP-keyID's:   0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)





RE: speed and scaling

2000-07-18 Thread Gregory Leblanc

Enough with the vulgarities.  This doesn't really belong on the RAID list
any longer, but I'll make a few points below.

 -Original Message-
 From: Marc Mutz [mailto:[EMAIL PROTECTED]]
 
The alphas we have here have the same number of slots.
   But not only one bus. They typically have 3 slots/bus.
  
  There are multiple pci bus x86 motherboards. Generally 
 found on systems
  with 6 slots. I have seen x86 motherboards with 3 PCI 
 buses, interrupted
 
 I'd like to see how the x86 memory subsystem can saturate 
 three (or only
 two) 533MB/sec 64/66 PCI busses and still have the bandwidth 
 to compute
 a 90MB/sec stream of data.

A P-III memory subsystem is capable of probably 800MB/sec; I doubt that it
can handle more than that.  Alphas and SPARCs have more than that, but you
pay through the nose for it.  It's also worth noting that x86 shares memory
bandwidth when you do SMP (800MB/sec between two processors), where the EV6
Alphas have a switched memory bus.  I haven't investigated that beyond
reading a couple of papers, but you can find more of that on Compaq's
website.

  but the most
  ive seen on alpha or sparc is 2.
 
 I never denied that such beasts exist. I just wanted to point 
 out that a
 x86 machine with those mobos would come close in price to the alpha
 solution.
 I simply can't imagine that there are no alpha boxen with more than 2
 PCI busses. If I had a faster internet connection now, I'd 
 check the web
 site of alpha-processor Inc.

http://www.compaq.com/alphaserver/gs80/index.html  This isn't even the top
of the line Alpha, but it has 16 (that's 0x10 in hex, or 20 in octal) PCI
busses.  Again, you PAY for that kind of machine.  I'm sure that the top of
the Alpha line must be close to as expensive as a Sun Ultra Enterprise
1, which goes for about seven figures.  Now that we know that you can
get bigger machines out of Alpha/SPARC than you can out of "stock" x86
machines, and that you have to pay for that kind of performance, can we get
back to talking about RAID on Linux please?
Grego



RE: speed and scaling

2000-07-18 Thread Dan Hollis

On Tue, 18 Jul 2000, Gregory Leblanc wrote:
 http://www.compaq.com/alphaserver/gs80/index.html  This isn't even the top
 of the line Alpha, but it has 16 (that's 0x10 in hex, or 20 in octal) PCI
 busses.  Again, you PAY for that kind of machine.

That machine costs around US$2million.

-Dan




Re: speed and scaling

2000-07-14 Thread Marc Mutz

Seth Vidal wrote:
 
  I'd try an alpha machine, with 66MHz-64bit PCI bus, and interleaved
  memory access, to improve memory bandwidth. It costs around $1
  with 512MB of RAM, see SWT (or STW) or Microway. This cost is
  small compared to the disks.
 The alpha comes with other headaches I'd rather not involve myself with -
 in addition the cost of the disks is trivial - 7 75gig scsi's @$1k each
 is only $7k - and the machine housing the machines also needs to be one
 which will do some of the processing - and all of their code is x86 - so
 I'm hesitant to suggest alphas for this.
 

Look at the reality! If you have to do this sort of thing, x86 will give
you headaches. Normal 33MHz/32-bit PCI (133MB/sec) is _way_ too slow for
that machine. The entire PCI bus cannot saturate _one_ Ultra160 SCSI
controller, let alone a GigEth card. Putting more than one into a box
and trying to use them concurrently will show you what good normal PCI
is for _really_ fast hardware. You sure want multiple 64-bit/66MHz PCI
busses - and _now_ look again at board prices for x86.

Also, if you do data analysis like setiathome does (i.e. mostly FP),
alphas blow away _any_ other microprocessor (setiathome work-unit in
less than an hour; my AMD K6-2 500 needs 18hrs!). Code can be re-compiled
and probably should be.

  Another advantage of the alpha is that you have more PCI slots. I'd
  put 3 disks on each card, and use about 4 of them per machine. This
  should be enough to get you 500GB.
 More how - the current boards I'm working with have 6-7 pci slots - no
 ISA's at all.
 

Look, you are on the _very_ wrong track! You may have 6 or 7 PCI
_slots_, but you have only _one_ bus, i.e. only 133MB/sec bandwidth for
_all_ 6 or 7 devices. You will not get 90MB/sec real throughput with a
bus bandwidth of 133MB/sec! And the x86 architecture's memory bandwidth
is _tiny_ (BX chipset does one or two _dozen_ MB/sec random access, i.e.
12-24 MB/sec).

 The alphas we have here have the same number of slots.

But not only one bus. They typically have 3 slots/bus.

 
  Might I also suggest a good UPS system? :-) Ah, and a journaling FS...
 
 the ups is a must  -the journaling filesystem is at issue too - In an
 ideal world there will be a Journaling File system that works correctly
 with sw raid :)
 
 -sv

-- 
Marc Mutz [EMAIL PROTECTED]http://marc.mutz.com/Encryption-HOWTO/
University of Bielefeld, Dep. of Mathematics / Dep. of Physics

PGP-keyID's:   0xd46ce9ab (RSA), 0x7ae55b9e (DSS/DH)




Re: speed and scaling

2000-07-14 Thread Edward Schernau

Boy, I don't know who was screaming about PCI bandwidth, but:

1)  I was the first person to mention it, weeks ago, so it's no
news flash, and you're no genius for thinking of it.
2)  Stop screaming.
3)  Be informed.

P.S.   Let us know what happens, Seth, this sounds like a cool project!
Ed



Re: speed and scaling

2000-07-13 Thread Malcolm Beattie

Seth Vidal writes:
  I have an odd question. Where I work we will, in the next year, be in a
 position to have to process about a terabyte or more of data. The data is
 probably going to be shipped on tapes to us but then it needs to be read
 from disks and analyzed. The process is segmentable so its reasonable to
 be able to break it down into 2-4 sections for processing so arguably only
 500gb per machine will be needed. I'd like to get the fastest possible
 access rates from a single machine to the data. Ideally 90MB/s+
 
 So were considering the following:
 
 Dual Processor P3 something.
 ~1gb ram.
 multiple 75gb ultra 160 drives - probably ibm's 10krpm drives
 Adaptec's best 160 controller that is supported by linux. 
 
 The data does not have to be redundant or stable - since it can be
 restored from tape at almost any time.
 
 so I'd like to put this in a software raid 0 array for the speed.
 
 So my questions are these:
  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
 across say 5-8 drives?
 What controllers/drives should I be looking at?

Here are actual benchmarks from one of my systems.

dbench (the leading number is the client count):

2 Throughput 123.637 MB/sec (NB=154.546 MB/sec  1236.37 MBit/sec)
4 Throughput 109.7 MB/sec (NB=137.126 MB/sec  1097 MBit/sec)
32 Throughput 77.7743 MB/sec (NB=97.2178 MB/sec  777.743 MBit/sec)
64 Throughput 64.3793 MB/sec (NB=80.4741 MB/sec  643.793 MBit/sec)


Bonnie:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         2000  9585 99.1 51312 26.0 28675 45.3  9224 94.8 81720 73.3 512.2  4.2

That is with a Dell 4400, 2 x 600 MHz Pentium III Coppermine CPUs
with 256K cache, 1GB RAM, one 64-bit 66MHz PCI bus (and one 33MHz
PCI bus). Disk subsystem is a built-in Adaptec 7899 dual-channel Ultra160
controller with 8 Quantum ATLAS IV 9GB SCA disks attached to one
channel. The benchmarks above were done on an ext2 filesystem with
4KB blocksize and stride 16 created on a 7-way stripe of the above
disks using software RAID (0.9 on kernel 2.2.x) with 64KB chunksize.
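
(A sketch of how a stripe like that is typically set up with the 0.90
raidtools; the device names here are invented for illustration and are not
the actual configuration used above.)

    # /etc/raidtab: 7-disk RAID0, 64KB chunks
    raiddev /dev/md0
        raid-level              0
        nr-raid-disks           7
        persistent-superblock   1
        chunk-size              64
        device                  /dev/sdb1
        raid-disk               0
        device                  /dev/sdc1
        raid-disk               1
        # ... the remaining five device/raid-disk pairs follow the same pattern

    # create the array, then an ext2 fs with 4KB blocks and stride 16
    # (64KB chunk / 4KB block = 16)
    mkraid /dev/md0
    mke2fs -b 4096 -R stride=16 /dev/md0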

If you use both SCSI channels and use 36GB disks (let alone 75GB
ones), you'll get 480GB of disk without even needing to plug another
SCSI card in. With larger disks or another SCSI card or two you could
go larger/faster.

--Malcolm

-- 
Malcolm Beattie [EMAIL PROTECTED]
Unix Systems Programmer
Oxford University Computing Services



Re: speed and scaling

2000-07-11 Thread Krisztián Tamás


  Hi!

 or just brilliant driver design by Leonard Zubkof, but the Mylex 
 cards are the performance king for hardware RAID under Linux (and
I was surprised to hear that. We just bought a Mylex AcceleRAID
250 with five 18G IBM disks, and it's a lot slower than sw RAID.

HW RAID 5  read:  26MB/s  write:   9MB/s  
SW RAID 5  read:  43MB/s  write:  39MB/s
SW RAID 0  read:  62MB/s  write:  65MB/s

Tamas




Re: speed and scaling

2000-07-11 Thread Chris Mauritz

 From [EMAIL PROTECTED] Tue Jul 11 05:21:43 2000
 
   Hi!
 
  or just brilliant driver design by Leonard Zubkof, but the Mylex 
  cards are the performance king for hardware RAID under Linux (and
 I was suprised to hear that. We just bought a Mylex AcceleRAID
 250 with five 18G IBM disks, and it's a lot slower than sw RAID.
 
 HW RAID 5  read:  26MB/s  write:   9MB/s  
 SW RAID 5  read:  43MB/s  write:  39MB/s
 SW RAID 0  read:  62MB/s  write:  65MB/s

The 250 is their lowest end card.  The ExtremeRAID cards have much
faster processors and more memory.  However, software RAID is almost
always faster.  I believe I mentioned that in my original note.

Cheers,

Chris
-- 
Christopher Mauritz
[EMAIL PROTECTED]



Re: speed and scaling

2000-07-11 Thread jlewis

On Tue, 11 Jul 2000, Krisztián Tamás wrote:

  or just brilliant driver design by Leonard Zubkof, but the Mylex 
  cards are the performance king for hardware RAID under Linux (and
 I was suprised to hear that. We just bought a Mylex AcceleRAID
 250 with five 18G IBM disks, and it's a lot slower than sw RAID.
 
 HW RAID 5  read:  26MB/s  write:   9MB/s  
 SW RAID 5  read:  43MB/s  write:  39MB/s

That's about the same as I get with an AcceleRAID 150 with 16MB and 5 IBM 9GB
7200rpm drives doing RAID5.  The RAID controller CPUs just don't seem to
keep up with the system processor speed-wise.  Of course, these are Mylex's
low-end entry level RAID cards.  I want to try out an ExtremeRAID and see
if write speed gets much better.

--
 Jon Lewis *[EMAIL PROTECTED]*|  I route
 System Administrator|  therefore you are
 Atlantic Net|  
_ http://www.lewis.org/~jlewis/pgp for PGP public key_




Re: speed and scaling

2000-07-11 Thread bug1

Seth Vidal wrote:
 
 So my questions are these:
  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
 across say 5-8 drives?
 What controllers/drives should I be looking at?


I'm a big IDE fan and have experimented with raid0 a fair bit; I dreamt
of achieving these speeds with 5 udma66 drives on multiple
controllers.

My experience is that IDE performance is seriously restricted in Linux
and definitely doesn't scale linearly.

I have had good experiences with Promise controllers, but I will never
again try to use multiple HPT cards.

Under 2.2.x, an IDE raid0 can get around 50MB/s max; it effectively
doesn't scale past 3 drives.

Under 2.4.x-test you can get a bit more than the speed of a single
drive - it's just f*cked. It has been a known problem for many months, but
it doesn't look like anything much is going to happen to fix it. I
suspect the performance problems may be tied up with the whole latency
thing, or the elevator code, but that's really just me guessing.

As far as I can tell, IDE raid performance is limited by the IDE code, not
the raid code.

I guess the good news is that SCSI performance is said to scale
linearly, and it sounds like in your situation you would have a few $
available to spend on SCSI hardware.

I'm sure a few people here would be interested in seeing some benchmarks
when you do get something happening.

Good luck.

Glenn



Re: speed and scaling

2000-07-10 Thread Henry J. Cobb

arguably only 500gb per machine will be needed. I'd like to get the fastest
possible access rates from a single machine to the data. Ideally 90MB/s+

Is this vastly read-only or will write speed also be a factor?

-HJC




Re: speed and scaling

2000-07-10 Thread Edward Schernau

Seth Vidal wrote:

[monster data set description snipped]

 So were considering the following:
 
 Dual Processor P3 something.
 ~1gb ram.
 multiple 75gb ultra 160 drives - probably ibm's 10krpm drives
 Adaptec's best 160 controller that is supported by linux.
[snip]
 So my questions are these:
  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
 across say 5-8 drives?

While you might get this from your controller data bus, I'm skeptical
of moving this much data consistently across the PCI bus.  I think
it has a maximum of 133 MB/sec bandwidth (33 MHz * 32 bits wide).
Especially if (as below) you have some network access going on at
near-gigabit speeds... you're just pushing lots of data.
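
(The peak theoretical figures, for reference - real-world throughput will be
noticeably lower:)

    echo $(( 33 * 32 / 8 ))    # 32-bit/33MHz PCI  -> ~132 MB/s peak
    echo $(( 66 * 64 / 8 ))    # 64-bit/66MHz PCI  -> ~528 MB/s peak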

 What controllers/drives should I be looking at?

See if there is some sort of system you can get with multiple
PCI busses, bridged or whatnot.

 And has anyone worked with gigabit connections to an array of this size
 for nfs access? What sort of speeds can I optimally (figuring nfsv3 in
 async mode from the 2.2 patches or 2.4 kernels) expect to achieve for
 network access.

I've found vanilla nfs performance to be crummy, but haven't played
with it at all.
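
(For the nfsv3 async question above, a read-only async export would look
roughly like the line below - the path and client network are made-up
examples, not a recommendation for a particular setup:)

    # /etc/exports
    /data   192.168.1.0/255.255.255.0(ro,async)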

Ed
-- 
Edward Schernau,mailto:[EMAIL PROTECTED]
Network Architect   http://www.schernau.com
RC5-64#: 243249 e-gold acct #:131897



RE: speed and scaling

2000-07-10 Thread Gregory Leblanc

 -Original Message-
 From: Seth Vidal [mailto:[EMAIL PROTECTED]]
 Sent: Monday, July 10, 2000 12:23 PM
 To: [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: speed and scaling
 
 So were considering the following:
 
 Dual Processor P3 something.
 ~1gb ram.
 multiple 75gb ultra 160 drives - probably ibm's 10krpm drives
 Adaptec's best 160 controller that is supported by linux. 
 
 The data does not have to be redundant or stable - since it can be
 restored from tape at almost any time.
 
 so I'd like to put this in a software raid 0 array for the speed.
 
 So my questions are these:
  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
 across say 5-8 drives?

Assuming sequential reads, you should be able to get this from good drives.
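
(Rough arithmetic, assuming the ~20MB/sec sustained per-drive figure
mentioned below:)

    echo $(( 90 / 20 ))    # -> 4, i.e. roughly 5 drives to sustain 90MB/s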

 What controllers/drives should I be looking at?

I'm not familiar with current top-end drives, but you should be looking for
at least 4MB of cache on the drives.  I think the best drives that you'll
find will be able to deliver 20MB/sec without trouble, possibly a bit more.
I seem to remember somebody on this list liking Adaptec cards, but nobody on the
SPARC lists will touch the things.  I might look at a Tekram, or a Symbios
based card, I've heard good things about them, and they're used on some of
the bigger machines that I've worked with.  Later,
Grego



Re: speed and scaling

2000-07-10 Thread Chris Mauritz

If you can afford it and this is for real work, you may want to
consider something like a Network Appliance Filer.  It will be
a lot more robust and quite a bit faster than rolling your own
array.  The downside is they are quite expensive.  I believe the
folks at Raidzone make a "poor man's" canned array that can 
stuff almost a terabyte in one box and uses cheaper IDE disks.

If you can't afford either of these solutions, 73gig Seagate
Cheetahs are becoming affordable.  Packing one of those
rackmount 8 bay enclosures with these gets you over 500gb
of storage if you just want to stripe them together.  That
would likely be VERY fast for reads/writes.  The risk is that
you'd lose everything if one of the disks crashed.

Cheers,

Chris

 From [EMAIL PROTECTED] Mon Jul 10 16:46:37 2000
 
 Sounds like fun.  Check out VA Linux's dual CPU boxes.  They also
 offer fast LVD SCSI drives which can be raided together.  I've got one
 dual P3-700 w/ dual 10k LVD drives.  FAST!
 
 I'd suggest staying away from NFS for performance reasons.  I think
 there is a better replacement out there ('coda' or something?). NFS
 will work, but I don't think it's what you want.  You could also try
 connecting the machines through SCSI if you want to share files
 quickly (I haven't done this, but have heard of it).
 
 Good luck!
 
 
 Phil
 
 On Mon, Jul 10, 2000 at 03:22:46PM -0400, Seth Vidal wrote:
  Hi folks,
   I have an odd question. Where I work we will, in the next year, be in a
  position to have to process about a terabyte or more of data. The data is
  probably going to be shipped on tapes to us but then it needs to be read
  from disks and analyzed. The process is segmentable so its reasonable to
  be able to break it down into 2-4 sections for processing so arguably only
  500gb per machine will be needed. I'd like to get the fastest possible
  access rates from a single machine to the data. Ideally 90MB/s+
  [...]
 
 -- 
 Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
[EMAIL PROTECTED] -- http://www.netroedge.com/~phil
  PGP F16: 01 D2 FD 01 B5 46 F4 F0  3A 8B 9D 7E 14 7F FB 7A
 


-- 
Christopher Mauritz
[EMAIL PROTECTED]



Re: speed and scaling

2000-07-10 Thread Seth Vidal

 arguably only 500gb per machine will be needed. I'd like to get the fastest
 possible access rates from a single machine to the data. Ideally 90MB/s+
 
 Is this vastly read-only or will write speed also be a factor?

mostly read-only.

-sv





RE: speed and scaling

2000-07-10 Thread m . allan noah

i have not used adaptec 160 cards, but i have found most everything else they
make to be very finicky about cabling and termination, and have had hard
drives give trouble on adaptec that worked fine on other cards.

my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and
symbios themselves have a nice 64 bit wide, dual channel pci scsi card.

which does lead to the point about pci. even _IF_ you could get the entire pci
bus to do your disk transfers, you will find that you would still need more
bandwidth for stuff like using your nics.

so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit
pci, or both. perhaps alpha?

allan

Gregory Leblanc [EMAIL PROTECTED] said:

  -Original Message-
  From: Seth Vidal [mailto:[EMAIL PROTECTED]]
  Sent: Monday, July 10, 2000 12:23 PM
  To: [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]
  Subject: speed and scaling
  
  So were considering the following:
  
  Dual Processor P3 something.
  ~1gb ram.
  multiple 75gb ultra 160 drives - probably ibm's 10krpm drives
  Adaptec's best 160 controller that is supported by linux. 
  
  The data does not have to be redundant or stable - since it can be
  restored from tape at almost any time.
  
  so I'd like to put this in a software raid 0 array for the speed.
  
  So my questions are these:
   Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
  across say 5-8 drives?
 
 Assuming sequential reads, you should be able to get this from good drives.
 
  What controllers/drives should I be looking at?
 
 I'm not familiar with current top-end drives, but you should be looking for
 at least 4MB of cache on the drives.  I think the best drives that you'll
 find will be able to deliver 20MB/sec without trouble, possibly a bit more.
 I seem to remember somebody on this liking Adaptec cards, but nobody on the
 SPARC lists will touch the things.  I might look at a Tekram, or a Symbios
 based card, I've heard good things about them, and they're used on some of
 the bigger machines that I've worked with.  Later,
   Grego
 






Re: speed and scaling

2000-07-10 Thread Seth Vidal

 If you can afford it and this is for real work, you may want to
 consider something like a Network Appliance Filer.  It will be
 a lot more robust and quite a bit faster than rolling your own
 array.  The downside is they are quite expensive.  I believe the
 folks at Raidzone make a "poor man's" canned array that can 
 stuff almost a terabyte in one box and uses cheaper IDE disks.

I priced the netapps - they are ridiculously expensive. They estimated 1tb
at about $60-100K - that's the size of our budget and we have other things
to get.

What I was thinking was a good machine with a 64bit pci bus and/or
multiple buses.
And A LOT of external enclosures.

 If you can't afford either of these solutions, 73gig Seagate
 Cheetahs are becoming affordable.  Packing one of those
 rackmount 8 bay enclosures with these gets you over 500gb
 of storage if you just want to stripe them together.  That
 would likely be VERY fast for reads/writes.  

 The risk is that you'd lose everything if one of the disks crashed.

this isn't much of a concern.
The plan so far was this (and this plan is dependent on what advice I get
from here)

Raid0 for the read-only data (as its all on tape anyway)
Raid5 or Raid1 for the writable data on a second scsi controller.

Does this sound reasonable?

I've had some uncomfortable experiences with hw raid controllers -
i.e. VERY poor performance and exorbitant prices.
My SW raid experiences under linux have been very good - excellent
performance and easy setup and maintenance. (well virtually no maintenance
:)

-sv






Re: speed and scaling

2000-07-10 Thread Keith Underwood

You will definitely need that 64 bit PCI bus.  You might want to watch out
for your memory bandwidth as well. (i.e. get something with interleaved
memory).  Standard PC doesn't get but 800MB/s peak to main memory.

FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a
gigabit ethernet card, at least under 2.2.  I don't have any experience w/
2.4 yet.  

On Mon, 10 Jul 2000, Seth Vidal wrote:

  If you can afford it and this is for real work, you may want to
  consider something like a Network Appliance Filer.  It will be
  a lot more robust and quite a bit faster than rolling your own
  array.  The downside is they are quite expensive.  I believe the
  folks at Raidzone make a "poor man's" canned array that can 
  stuff almost a terabyte in one box and uses cheaper IDE disks.
 
 I priced the netapps - they are ridiculously expensive. They estimated 1tb
 at about $60-100K - thats the size of our budget and we have other things
 to get.
 
 What I was thinking was a good machine with a 64bit pci bus and/or
 multiple buses.
 And A LOT of external enclosures.
 
  If you can't afford either of these solutions, 73gig Seagate
  Cheetahs are becoming affordable.  Packing one of those
  rackmount 8 bay enclosures with these gets you over 500gb
  of storage if you just want to stripe them together.  That
  would likely be VERY fast for reads/writes.  
 
  The risk is that you'd lose everything if one of the disks crashed.
 
 this isn't much of a concern.
 The plan so far was this (and this plan is dependent on what advice I get
 from here)
 
 Raid0 for the read-only data (as its all on tape anyway)
 Raid5 or Raid1 for the writable data on a second scsi controller.
 
 Does this sound reasonable?
 
 I've had some uncomfortable experiences with hw raid controllers -
 ie: VERY poor performance and exbortitant prices.
 My SW raid experiences under linux have been very good - excellent
 performance and easy setup and maintenance. (well virtually no maintenance
 :)
 
 -sv
 
 
 
 

---
Keith Underwood   Parallel Architecture Research Lab (PARL)
[EMAIL PROTECTED]  Clemson University




Re: speed and scaling

2000-07-10 Thread Chris Mauritz

I haven't had very good experiences with the Adaptec cards either.

If you can take the performance hit, the Mylex ExtremeRAID cards
come in a 3-channel variety.  You could then split your array
into 3 chunks of 3-4 disks each and use hardware RAID instead of
the software raidtools.

Cheers,

Chris

 From [EMAIL PROTECTED] Mon Jul 10 17:10:27 2000
 
 i have not used adaptec 160 cards, but i have found most everything else they
 make to be very finicky about cabling and termination, and have had hard
 drives give trouble on adaptec that worked fine on other cards.
 
 my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and
 symbios themselves have a nice 64 bit wide, dual channel pci scsi card.
 
 which does lead to the point about pci. even _IF_ you could get the entire pci
 bus to do your disk transfers, you will find that you would still need more
 bandwidth for stuff like using your nics.
 
 so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit
 pci, or both. perhaps alpha?
 
 allan
 
 Gregory Leblanc [EMAIL PROTECTED] said:
 
   -Original Message-
   From: Seth Vidal [mailto:[EMAIL PROTECTED]]
   Sent: Monday, July 10, 2000 12:23 PM
   To: [EMAIL PROTECTED]
   Cc: [EMAIL PROTECTED]
   Subject: speed and scaling
   
   So were considering the following:
   
   Dual Processor P3 something.
   ~1gb ram.
   multiple 75gb ultra 160 drives - probably ibm's 10krpm drives
   Adaptec's best 160 controller that is supported by linux. 
   
   The data does not have to be redundant or stable - since it can be
   restored from tape at almost any time.
   
   so I'd like to put this in a software raid 0 array for the speed.
   
   So my questions are these:
Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
   across say 5-8 drives?
  
  Assuming sequential reads, you should be able to get this from good drives.
  
   What controllers/drives should I be looking at?
  
  I'm not familiar with current top-end drives, but you should be looking for
  at least 4MB of cache on the drives.  I think the best drives that you'll
  find will be able to deliver 20MB/sec without trouble, possibly a bit more.
  I seem to remember somebody on this liking Adaptec cards, but nobody on the
  SPARC lists will touch the things.  I might look at a Tekram, or a Symbios
  based card, I've heard good things about them, and they're used on some of
  the bigger machines that I've worked with.  Later,
  Grego
  
 
 
 
 


-- 
Christopher Mauritz
[EMAIL PROTECTED]



RE: speed and scaling

2000-07-10 Thread Seth Vidal

 i have not used adaptec 160 cards, but i have found most everything else they
 make to be very finicky about cabling and termination, and have had hard
 drives give trouble on adaptec that worked fine on other cards.
 
 my money stays with a lsi/symbios/ncr based card. tekram is a good vendor, and
 symbios themselves have a nice 64 bit wide, dual channel pci scsi card.
can you tell me the model number on that card?

 which does lead to the point about pci. even _IF_ you could get the entire pci
 bus to do your disk transfers, you will find that you would still need more
 bandwidth for stuff like using your nics.
right.

 so, i suggest you investigate a motherboard with either 66mhz pci or 64 bit
 pci, or both. perhaps alpha?
the money I would spend on an alpha precludes that option

But some of dell's server systems support 64bit buses.

thanks
-sv







Re: speed and scaling

2000-07-10 Thread Seth Vidal

 FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a
 gigabit ethernet card, at least under 2.2.  I don't have any experience w/
 2.4 yet.  
I hadn't planned on implementing this under 2.2 - I realize the
constraints on the network performance. I've heard good things about 2.4's
ability to scale to those levels though.

thanks for the advice.

-sv





Re: speed and scaling

2000-07-10 Thread phil

On Mon, Jul 10, 2000 at 05:40:54PM -0400, Seth Vidal wrote:
  FWIW, you are going to have trouble pushing anywhere near 90MB/s out of a
  gigabit ethernet card, at least under 2.2.  I don't have any experience w/
  2.4 yet.  
 I hadn't planned on implementing this under 2.2 - I realize the
 constraints on the network performance. I've heard good things about 2.4's
 ability to scale to those levels though.

2.4.x technically doesn't exist yet.  There are some (pre) test
versions by Linus and Alan Cox out awaiting feedback from testers, but
nothing solid or consistent yet.  Be careful when using these for
serious work.  Newer != Better


Phil

-- 
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
   [EMAIL PROTECTED] -- http://www.netroedge.com/~phil
 PGP F16: 01 D2 FD 01 B5 46 F4 F0  3A 8B 9D 7E 14 7F FB 7A



RE: speed and scaling

2000-07-10 Thread Carlos Carvalho

I'd try an alpha machine, with 66MHz-64bit PCI bus, and interleaved
memory access, to improve memory bandwidth. It costs around $1
with 512MB of RAM, see SWT (or STW) or Microway. This cost is
small compared to the disks.

I've never had trouble with adaptec cards, if you terminate things
according to specs, and preferably use terminators on the cables, not
the card one. In fact I once had a termination problem, because I was
using the card for it... It hosed my raid5 array, because there were
two disks on that card...

Another advantage of the alpha is that you have more PCI slots. I'd
put 3 disks on each card, and use about 4 of them per machine. This
should be enough to get you 500GB.

If there are lots of network traffic this will likely be your
bottleneck, particularly because of latency.

Might I also suggest a good UPS system? :-) Ah, and a journaling FS...



Re: speed and scaling

2000-07-10 Thread Seth Vidal

 There are some (pre) test
 versions by Linux and Alan Cox out awaiting feedback from testers, but
 nothing solid or consistent yet.  Be careful when using these for
 serious work.  Newer != Better

This isn't being planned for the next few weeks - it's 2-6 month planning
that I'm doing. So I'm estimating that 2.4 should be out within 6 months. I
think that's a reasonable guess.


-sv





RE: speed and scaling

2000-07-10 Thread Seth Vidal

 I'd try an alpha machine, with 66MHz-64bit PCI bus, and interleaved
 memory access, to improve memory bandwidth. It costs around $1
 with 512MB of RAM, see SWT (or STW) or Microway. This cost is
 small compared to the disks.
The alpha comes with other headaches I'd rather not involve myself with -
in addition the cost of the disks is trivial - 7 75gig scsi's @$1k each
is only $7k - and the machine housing the machines also needs to be one
which will do some of the processing - and all of their code is x86 - so
I'm hesitant to suggest alphas for this.

 Another advantage of the alpha is that you have more PCI slots. I'd
 put 3 disks on each card, and use about 4 of them per machine. This
 should be enough to get you 500GB.
More how? The current boards I'm working with have 6-7 pci slots - no
ISA's at all.

The alphas we have here have the same number of slots.


 Might I also suggest a good UPS system? :-) Ah, and a journaling FS...

the ups is a must - the journaling filesystem is at issue too - in an
ideal world there would be a journaling file system that works correctly
with sw raid :)

-sv





Re: speed and scaling

2000-07-10 Thread Chris Mauritz

 From [EMAIL PROTECTED] Mon Jul 10 17:53:34 2000
 
  If you can take the performance hit, the Mylex ExtremeRAID cards
  come in a 3-channel variety.  You could then split your array
  into 3 chunks of 3-4 disks each and use hardware RAID instead of
  the software raidtools.
 
 I've not had good performance out of mylex. In fact its been down-right
 shoddy.
 
 I'm hesistant to purchase from them again.

Unfortunately, they are the Ferrari of the hardware RAID cards.

Compare an ExtremeRAID card against anything from DPT or ICP-Vortex.
There is no comparison.  I'm not sure if it's poor hardware design
or just brilliant driver design by Leonard Zubkof, but the Mylex 
cards are the performance king for hardware RAID under Linux (and
Windows NT/2K for that matter).

Cheers,

Chris

-- 
Christopher Mauritz
[EMAIL PROTECTED]



Re: speed and scaling

2000-07-10 Thread Chris Mauritz

 From [EMAIL PROTECTED] Mon Jul 10 18:43:11 2000
 
  There are some (pre) test
  versions by Linux and Alan Cox out awaiting feedback from testers, but
  nothing solid or consistent yet.  Be careful when using these for
  serious work.  Newer != Better
 
 This isn't being planned for the next few weeks - its 2-6month planning
 that I'm doing. So I'm estimating that 2.4 should be out w/i 6months. I
 think thats a reasonable guess.

That's a really bad assumption.  2.4 has been a "real soon now"
item since January, and it is still hanging in the vapors.

If you're doing this for "production work," I'd plan on a 2.2 kernel
or some known "safe" 2.3 kernel.

C
-- 
Christopher Mauritz
[EMAIL PROTECTED]



Re: speed and scaling

2000-07-10 Thread jlewis

On Mon, 10 Jul 2000, Seth Vidal wrote:

 What I was thinking was a good machine with a 64bit pci bus and/or
 multiple buses.
 And A LOT of external enclosures.

Multiple Mylex extremeRAID's.

 I've had some uncomfortable experiences with hw raid controllers -
 ie: VERY poor performance and exbortitant prices.

You're thinking of DPT :)
The Mylex stuff (at least the low-end AcceleRAIDs) is cheap and not too
slow.

--
 Jon Lewis *[EMAIL PROTECTED]*|  I route
 System Administrator|  therefore you are
 Atlantic Net|  
_ http://www.lewis.org/~jlewis/pgp for PGP public key_




Re: speed and scaling

2000-07-10 Thread jlewis

On Mon, 10 Jul 2000, Seth Vidal wrote:

  arguably only 500gb per machine will be needed. I'd like to get the fastest
  possible access rates from a single machine to the data. Ideally 90MB/s+
  
  Is this vastly read-only or will write speed also be a factor?
 
 mostly read-only.

If it were me, I'd do big RAID5 arrays.  Sure, you have the data on tape,
but do you want to sit around while hundreds of GB are restored from tape?
RAID5 should give you the read speed of RAID0, and if you're not writing
much, the write penalty shouldn't be so bad.  If it were totally
read-only, you could mount ro, and save yourself considerable fsck time if
there's an improper shutdown.
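
(A read-only mount is just a matter of mount options - the device and mount
point below are made up for illustration:)

    # /etc/fstab
    /dev/md0   /data   ext2   ro,noatime   0 0

    # or remount an already-mounted array read-only:
    mount -o remount,ro /data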

--
 Jon Lewis *[EMAIL PROTECTED]*|  I route
 System Administrator|  therefore you are
 Atlantic Net|  
_ http://www.lewis.org/~jlewis/pgp for PGP public key_