Hi folks,
 I have an odd question. Where I work we will, in the next year, be in a
position where we have to process about a terabyte or more of data. The
data will probably be shipped to us on tapes, but then it needs to be read
from disks and analyzed. The process is segmentable, so it's reasonable
to break it down into 2-4 sections for processing; arguably only
500 GB per machine will be needed. I'd like to get the fastest possible
access rates from a single machine to the data. Ideally 90 MB/s or more.

So we're considering the following:

Dual-processor P3 of some sort.
~1 GB of RAM.
Multiple 75 GB Ultra160 drives - probably IBM's 10k RPM drives.
Adaptec's best Ultra160 controller that is supported by Linux.

The data does not have to be redundant or stable, since it can be
restored from tape at almost any time.

So I'd like to put this in a software RAID 0 array for the speed.

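For concreteness, the setup I'm picturing would look something like this with the raidtools-era /etc/raidtab (device names are just placeholders, and the chunk size is a guess I'd expect to tune):

```
# /etc/raidtab - software RAID 0 across 5 disks (sketch only)
raiddev /dev/md0
    raid-level            0
    nr-raid-disks         5
    persistent-superblock 1
    chunk-size            64      # 64 KB stripes; worth tuning for big sequential reads
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1
    device                /dev/sdc1
    raid-disk             2
    device                /dev/sdd1
    raid-disk             3
    device                /dev/sde1
    raid-disk             4
```

Then mkraid /dev/md0 and mke2fs on top of it, as I understand the usual procedure.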
So my questions are these:
 Is 90 MB/s a reasonable speed to achieve in a RAID 0 array
across, say, 5-8 drives?
What controllers/drives should I be looking at?

And has anyone worked with gigabit connections to an array of this size
for NFS access? What sort of speeds can I optimally expect for network
access (figuring NFSv3 in async mode from the 2.2 patches, or a 2.4
kernel)?
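For reference, the export and mount options I'd expect to start from (paths and addresses hypothetical; async seems safe here since the data is expendable):

```
# /etc/exports on the server - async for speed, data can be re-read from tape
/data   192.168.1.0/255.255.255.0(rw,async,no_root_squash)

# mount on the client - NFSv3 with the largest transfer sizes the kernel allows
mount -t nfs -o nfsvers=3,rsize=8192,wsize=8192 server:/data /mnt/data
```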

thanks
-sv
