On Wed, 2 Dec 1998, Paul Komarek wrote:

> 
> Okay, my advisor has settled on our small machine:  a quad Xeon with 2GB.
> But now he's working on the proposal for the big machine, and guess who's
> going to do the groundwork.
> 
> So here's the question:  How big can Linux SMP get?  In particular, how
> many processors, and does it depend upon which architecture?  It isn't
> enough to know what the kernel can support, I need to actually find
> supporting hardware.  We are more loyal to Linux than anything else, but
> can we get eight or sixteen cpu SMP hardware for Linux to run on?  How
> about 32 or 64 cpus?
   (deleted)
> Obviously I'm no multiprocessor guru, but I'm willing to work towards that
> end--if someone can point me to information which has some nonzero
> probability of being helpful, I'd be grateful.  I've spent a lot of time
> on manufacturer's web sites with very little to show for it, so I'm
> willing to follow any new lead.

Check out the beowulf project -- www.beowulf.org.  You don't say whether
or not you're designing a production machine for doing calculations in,
say, a physics department or if you are building a machine for doing
computer science and tinkering with the multiprocessor environment
itself.  If you are just trying to optimize a cost/benefit equation in a
production environment (get the most cycles for the least cost for use
in parallel computations) then a beowulf or related setup (COW, NOW,
CLUMP or the acronym of your choice) is almost certain to be a factor of
10 to 100 more cost-beneficial than buying dedicated multiprocessing
iron, even iron that runs linux.

A beowulf is basically a rack of commodity boxes (e.g. single or SMP
Intel or Alpha boxes) running linux with a fast commodity network
interconnecting them.  Linux itself may be modified in certain key
areas so that the whole cluster can be perceived as a "single
system" and to improve IPC between nodes.  One writes parallel
programs in PVM or MPI or using some more exotic methods (like raw
sockets, if one can handle them).  NOW/COW/CLUMPs are basically other
kinds of clusters that one uses the same way but without those
modifications or with different ones, relaxing the requirement
that all the systems have a "single head" and be perceivable as a
"single parallel supercomputer".

Whether or not this is right for you depends on the "grain" of your
parallel code -- how synchronous its communications need to be, what
the ratio of time spent doing communications to time spent doing
calculations (for a given task partitioning) is, and so on.
If the task(s) are primarily coarse-to-medium grained (do more
calculation on a node than communication between nodes) then a
beowulf/cluster is definitely the way to go, likely by a factor of 20 or
more in price/performance advantage over e.g. an SP2.  If the
calculations are fine grained (e.g. hydrodynamics, galactic evolution,
things with long-range forces and hence lots of IPC) then a beowulf still
might be the way to go but the cost/benefit is less and you'll have to
work harder to get the benefit.  There are definitely problems for which
the T3E is a better solution than a beowulf, although I personally
believe that this will change drastically in the next two years.
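The back-of-the-envelope arithmetic behind that grain argument can be
sketched as follows.  The model (efficiency = compute time over
compute-plus-communication time) is a deliberate simplification of my
own for illustration -- it ignores latency hiding, contention, and the
like:

```python
# Toy model of parallel efficiency vs. "grain": if each node spends
# t_comp seconds computing for every t_comm seconds communicating,
# the fraction of wall-clock time doing useful work is roughly
#     efficiency = t_comp / (t_comp + t_comm)
# and the expected speedup on n nodes is about n * efficiency.

def efficiency(t_comp, t_comm):
    return t_comp / (t_comp + t_comm)

def speedup(n_nodes, t_comp, t_comm):
    return n_nodes * efficiency(t_comp, t_comm)

# Coarse grained: 100 s of computation per 1 s of communication
# keeps 16 nodes nearly fully busy.
print(speedup(16, 100.0, 1.0))   # about 15.8 of an ideal 16
# Fine grained: communication comparable to computation wastes
# half the cluster.
print(speedup(16, 1.0, 1.0))     # 8.0
```

This is why coarse-grained tasks on cheap commodity nodes win so
decisively, while fine-grained tasks push you toward the expensive
low-latency interconnects of dedicated iron like the T3E.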

HTH,

  rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:[EMAIL PROTECTED]
