On 10/2/06, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
> I agree that generally libraries are a much more practical thing to
> implement. I don't know which category Sequoia falls into, since I
> haven't read the paper; the only way it sounds interesting to this
> conversation is that it has some mildly interesting benchmark numbers
> associated with it.
>
> At the moment I'm a lot more interested in packets per second and
> bandwidth for future architectures than in flops per second. CPU
> architects understand how to push the flops per second of
> general-purpose processors quite high, and they are unlikely to
> develop amnesia when they have competition that is just as able as
> they are.
>
> Where the current challenge comes from is how you keep the bandwidth
> and I/O rates to the outside world growing at an exponential rate.
> Maybe big SMPs on a chip will help with this if they can isolate most
> of the traffic inside themselves, but I doubt it.
See Table 3 and Figure 11 on pages 9 and 10, respectively. Table 3
reports bandwidth for the memory-bound benchmarks, and Figure 11 graphs
time spent waiting on memory, overhead (barrier synchronization plus
runtime logic), and leaf task execution time. For the memory-bound
SAXPY and SGEMV kernels they report 18-22.1 GB/s out of a theoretical
25.6 GB/s for the XDR memory. They don't list bandwidth numbers for the
cluster, though they do show the performance delta between runs with
and without pre-distribution of the data. (A rough sketch of how an
achieved-bandwidth number like that is derived is appended below.)

Andrew

--
Andrew Shewmaker
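For anyone curious where an achieved-bandwidth number like that comes
from, here is a rough, generic SAXPY sketch in plain C. It's my own
illustration, not code from the paper (and nothing like the Cell
runtime): time the kernel, count 3 * N * sizeof(float) bytes per pass
(read x, read y, write y), and divide by elapsed time. Efficiency is
then achieved over peak, e.g. 22.1 / 25.6 is roughly 86% of XDR peak
in their best case.

  /* saxpy_bw.c -- rough SAXPY bandwidth sketch (illustration only).
   * Build with something like: gcc -O2 -std=gnu99 saxpy_bw.c
   * (add -lrt for clock_gettime on older glibc). */
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  #define N    (1 << 26)   /* ~67M floats per array, enough to defeat cache */
  #define REPS 10

  static void saxpy(float a, const float *x, float *y, size_t n)
  {
      for (size_t i = 0; i < n; i++)
          y[i] = a * x[i] + y[i];
  }

  int main(void)
  {
      float *x = malloc((size_t)N * sizeof(float));
      float *y = malloc((size_t)N * sizeof(float));
      if (!x || !y) {
          fprintf(stderr, "allocation failed\n");
          return 1;
      }

      for (size_t i = 0; i < N; i++) {
          x[i] = 1.0f;
          y[i] = 2.0f;
      }

      struct timespec t0, t1;
      clock_gettime(CLOCK_MONOTONIC, &t0);
      for (int r = 0; r < REPS; r++)
          saxpy(2.0f, x, y, N);
      clock_gettime(CLOCK_MONOTONIC, &t1);

      double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
      /* SAXPY moves 3 words per element: read x, read y, write y. */
      double bytes = (double)REPS * 3.0 * N * sizeof(float);

      printf("SAXPY: %.2f GB/s achieved\n", bytes / secs / 1e9);

      free(x);
      free(y);
      return 0;
  }

On a real machine you'd also want to pin the process and pick an array
size appropriate to the memory you're measuring, but the byte
accounting is the same idea.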
