On 10/04/10 06:59, Oliver Geisler wrote:
> These are the results of skampi pt2pt, first with shared memory allowed,
> second with shared memory excluded.

For what it's worth, I can't replicate those results on an AMD Shanghai
cluster running a 2.6.32 kernel and Open-MPI 1.4.1.

Here is what I see (run under Torque, selecting 2 cores on the same node,
so no need to specify -np):

$ mpirun --mca btl self,sm,tcp ./skampi -i ski/skampi_pt2pt.ski

  # begin result "Pingpong_Send_Recv"
  count=    1      4    2.0   0.0  16    2.0    1.8
  count=    2      8    2.1   0.0  16    2.1    1.8
  count=    3     12    2.1   0.1   8    2.0    2.0
  count=    4     16    2.1   0.1   8    2.0    2.0
  count=    6     24    2.0   0.0  16    2.0    1.8
  count=    8     32    2.9   0.0  16    2.7    2.4
  count=   11     44    2.3   0.1  16    2.2    2.0
  count=   16     64    2.2   0.1  16    2.1    2.0
  count=   23     92    2.7   0.2  16    2.6    2.1
  count=   32    128    2.5   0.1  16    2.5    2.1
  count=   45    180    3.0   0.0  16    2.8    2.6
  count=   64    256    3.1   0.0   8    3.0    2.5
  count=   91    364    3.1   0.0   8    3.0    3.0
  count=  128    512    3.4   0.2  16    3.3    3.0
  count=  181    724    4.1   0.0  16    4.0    4.1
  count=  256   1024    5.0   0.0   8    4.5    4.5
  count=  362   1448    6.0   0.0  16    5.8    5.7
  count=  512   2048    7.7   0.1  16    7.3    7.6
  count=  724   2896   10.0   0.0   8   10.0    9.8
  count= 1024   4096   12.3   0.1  16   12.1   12.0
  count= 1448   5792   13.8   0.2   8   13.5   13.4
  count= 2048   8192   18.1   0.0  16   17.9   18.1
  count= 2896  11584   25.0   0.0  16   24.9   25.0
  count= 4096  16384   34.2   0.1  16   34.0   34.2
  # end result "Pingpong_Send_Recv"
  # duration = 0.00 sec

$ mpirun --mca btl tcp,self ./skampi -i ski/skampi_pt2pt.ski

  # begin result "Pingpong_Send_Recv"
  count=    1      4   21.2   1.0  16   20.1   17.8
  count=    2      8   20.8   1.0  16   20.6   16.7
  count=    3     12   20.2   0.9  16   19.0   17.1
  count=    4     16   19.9   1.0  16   19.0   17.0
  count=    6     24   21.1   1.1  16   20.6   17.0
  count=    8     32   20.0   1.0  16   18.8   17.1
  count=   11     44   20.9   0.8  16   20.0   17.1
  count=   16     64   21.7   1.1  16   20.5   17.6
  count=   23     92   21.7   1.0  16   20.0   18.5
  count=   32    128   21.6   1.0  16   20.5   18.5
  count=   45    180   22.0   1.0  16   20.9   19.0
  count=   64    256   21.8   0.7  16   20.5   20.2
  count=   91    364   20.5   0.3  16   19.8   19.1
  count=  128    512   18.5   0.3   8   17.5   18.1
  count=  181    724   19.3   0.2   8   19.1   19.0
  count=  256   1024   20.3   0.3  16   19.7   20.0
  count=  362   1448   22.1   0.3  16   21.2   21.4
  count=  512   2048   24.2   0.3  16   23.7   23.2
  count=  724   2896   24.8   0.5   8   24.0   24.0
  count= 1024   4096   26.8   0.2  16   26.1   26.3
  count= 1448   5792   31.6   0.3  16   30.4   31.5
  count= 2048   8192   38.0   0.6  16   37.3   37.1
  count= 2896  11584   52.1   1.4  16   49.1   50.8
  count= 4096  16384   93.8   1.1  16   81.1   91.5
  # end result "Pingpong_Send_Recv"
  # duration = 0.02 sec

cheers,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC