On Fri, 2007-03-23 at 03:23 -0400, Patrick Geoffray wrote:
> It is unbelievable that so few people denounce it. It is clearly
> implemented only to cheat on a micro-benchmark. What's next ? Checking
> that the buffer to send is identical to the previous one to avoid
> sending "redundant" messages in ping-pong ?!?

Far better to check if the buffer you are sending is just many COW
copies of the kernel zero page and, if it is, get the receiver to
mmap() /dev/zero over the recv buffer. Every benchmark should
initialise transmitted data before it is sent, if only to prevent page
faults inside the timing loop.

We don't play the zero-page trick ourselves, of course, but we often
comment that with a lot of benchmarks we could get fairly large
bandwidth numbers if we did.

> If you want to show the impact of concurrent communications, something
> latency-based like the HPCC ring test is the best way (eventually with
> more nodes). The millions of packet per second of a stream-based
> benchmark are lovely for the marketing folks, but has little meaning for
> real codes that computes a minimum. However, an alltoall on many
> cores/nodes would exercise the same metric (many sends/recvs on the same
> NIC at the same time), but would be harder to cheat and be much more
> meaningful IMHO.

Alltoall is one of the hardest functions to optimise, purely because of
contention in the NIC; the optimisations we do aim to reduce this
contention and avoid hotspots. It's probably a good thing to benchmark
to get an idea of the capability of a given network.

Ashley
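
To illustrate the point about initialising transmitted data before it is sent, here is a minimal sketch in C. The helper names (alloc_touched_buffer, has_zero_byte) and the fill pattern are hypothetical and not taken from any benchmark discussed in this thread. The idea is only that writing to every byte before the timing loop starts should give each page its own private, already-faulted-in copy, so the buffer is no longer backed by the shared zero page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a send buffer and write a non-zero
 * pattern to every byte, so that every page is touched before any
 * timing starts. Returns NULL if the allocation fails. */
unsigned char *alloc_touched_buffer(size_t len)
{
    assert(len > 0);  /* a zero-length benchmark buffer is a caller bug */

    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return NULL;

    /* Values cycle through 1..251, so no byte is ever left at zero. */
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(i % 251 + 1);

    return buf;
}

/* Hypothetical sanity check: returns 1 if any byte is still zero. */
int has_zero_byte(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0)
            return 1;
    return 0;
}
```

A benchmark would call alloc_touched_buffer() for both the send and the receive buffer, and only then start its timer.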
