On 10/2/06, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

I agree that generally libraries are a much more practical thing to implement.

I don't know which category Sequoia falls into; I haven't read the paper.
The only way it sounds interesting to this conversation is that it has
some mildly interesting benchmark numbers associated with it.

At the moment I'm a lot more interested in packets per second and bandwidth
for future architectures than in flops per second.  CPU architects
understand how to push the flops per second of general purpose processors
quite high and are unlikely to develop amnesia when they have competition
that is just as able as they are.  The current challenge is how to keep
the bandwidth and I/O rates to the outside world growing at an exponential
rate.  Maybe big SMPs on a chip will help with this if they can isolate
most of the traffic inside themselves, but I doubt it.

See Table 3 and Figure 11 on pages 9 and 10, respectively.  Table 3 reports
bandwidth for memory bound benchmarks, and Figure 11 graphs time waiting
on memory, overhead (barrier sync + runtime logic), and leaf task execution
time.

For memory bound SAXPY and SGEMV they report 18-22.1 GB/s out of a
theoretical 25.6 GB/s for the XDR memory.
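
To put those figures in terms of peak utilization, here is a quick sketch of
the arithmetic. It uses only the numbers quoted above; the function and
variable names are made up for illustration and do not come from the paper.

```python
# Illustrative arithmetic only: the figures are the ones quoted above;
# the function and variable names are hypothetical.

XDR_PEAK_GBPS = 25.6  # theoretical XDR memory bandwidth quoted above


def fraction_of_peak(achieved_gbps, peak_gbps=XDR_PEAK_GBPS):
    """Return achieved bandwidth as a fraction of the theoretical peak."""
    if peak_gbps <= 0:
        raise ValueError("peak bandwidth must be positive")
    return achieved_gbps / peak_gbps


# Endpoints of the reported 18-22.1 GB/s range for SAXPY and SGEMV.
low = fraction_of_peak(18.0)
high = fraction_of_peak(22.1)

print(f"{low:.1%} to {high:.1%} of peak")
```

That works out to roughly 70% to 86% of the theoretical memory bandwidth.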

They don't list the bandwidth numbers for the cluster, though they do show
the performance delta between runs with and without pre-distribution.

Andrew

--
Andrew Shewmaker
