On Thu, 2002-01-31 at 12:24, Andrew Perrin wrote:
> Okay, this will be fun :)
>
> I'm putting together a research grant for some fairly heavy text crunching
> (categorizing thousands of documents using statistical methods). At the
> moment the grant is in the "reach for the sky" phase, meaning look for the
> best-possible technical solution. Eventually, of course, we will probably
> have to cut down.
>
> But for now, I'd like advice on hardware, potentially costing as much as
> $25,000 for this project. I'm open to clustered solutions as well as
> single-machine solutions, although I don't want to spend much time keeping
> the cluster going. Things I've thought of include:
>
> - IBM IntelliStation Z line
> - Sun Enterprise 450 or 420R
> - SGI 2200 or something like that (don't know SGI's line well)
> - Building a standard Intel-based system (dual fast processors, 4G RAM,
>   fast SCSI disks, etc.)
>
> So, what would you do?

I assume by text processing you mean mostly integer work with some
floating point. In either case, you should be aware of the SPEC
benchmarks:

  http://www.spec.org/osg/cpu2000/results/

and read how the benchmark scores are calculated before browsing.
You'll notice that, at the moment, the AMD Athlons are the best in
terms of operations (either floating point or integer) per second per
dollar.

You can get dual Athlon systems for very competitive prices online. Or
pick up a recent copy of Linux Journal and you'll see multiple ads for
companies selling dual-Athlon systems that come with Linux pre-loaded
and pre-configured. For $25K you could build a small cluster.

But getting back to the original question: how do you know whether
your application(s) will be CPU-bound? If you're doing a lot of
searching, your work is more likely to be IO-bound and in that case
you're better off getting relatively cheaper/slower CPUs and putting
your grant money into a large/fast SCSI array.

hth,
Ed

-- 
Edward H. Hill III, PhD
Post-Doctoral Researcher   |  Email: [EMAIL PROTECTED], [EMAIL PROTECTED]
Division of ESE            |  URL:   http://www.eh3.com
Colorado School of Mines   |  Phone: 303-273-3483
Golden, CO 80401           |  Fax:   303-273-3311
Key fingerprint = 5BDE 4DA1 66BE 4F7B BC17  3A0C 932B 7266 1E76 F123
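
On the CPU-bound versus IO-bound question, one rough way to get a first answer before buying anything is to run a representative job and compare the CPU time it consumed against the wall-clock time it took. The Python sketch below is only an illustration under stated assumptions: cpu_fraction, count_words, and wait_like_io are hypothetical names made up for this example, the word counter merely stands in for real text crunching, and the sleep merely stands in for time spent waiting on disk. A ratio near 1 suggests the job is CPU-bound; a ratio well below 1 suggests it spends most of its time waiting, typically on IO.

```python
import time


def cpu_fraction(work, *args, **kwargs):
    """Run work once; return (cpu_seconds, wall_seconds, cpu/wall ratio).

    Hypothetical helper for illustration. CPU time covers this process
    only (user + system), so time spent blocked on disk or asleep is
    counted in the wall clock but not in the CPU figure.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return cpu, wall, ratio


def count_words(text):
    """Stand-in for CPU-heavy text crunching: count word frequencies."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def wait_like_io(seconds):
    """Stand-in for an IO wait (assumption: sleeping mimics a blocked read)."""
    time.sleep(seconds)


if __name__ == "__main__":
    sample = "the quick brown fox " * 200000
    for name, job, arg in (("count_words", count_words, sample),
                           ("wait_like_io", wait_like_io, 0.2)):
        cpu, wall, ratio = cpu_fraction(job, arg)
        print(f"{name}: cpu={cpu:.3f}s wall={wall:.3f}s ratio={ratio:.2f}")
```

To use it on a real workload, pass the function that processes a sample of the actual documents in place of count_words. Two caveats: a multi-threaded job can report a ratio above 1 because CPU time is summed across threads, and a sample small enough to sit entirely in the operating system's file cache will look more CPU-bound than the full data set would.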
