On 10/04/10 06:59, Oliver Geisler wrote:
These are the results of skampi pt2pt, first with shared memory allowed,
second with shared memory excluded.
For what it's worth, I can't replicate those results on an AMD Shanghai
cluster running a 2.6.32 kernel and Open MPI 1.4.1.
Here is what I see (run under Torque, selecting 2 cores on the same
node, so no need to specify -np):
$ mpirun --mca btl self,sm,tcp ./skampi -i ski/skampi_pt2pt.ski
# begin result "Pingpong_Send_Recv"
count= 1        4    2.0   0.0   16    2.0    1.8
count= 2        8    2.1   0.0   16    2.1    1.8
count= 3       12    2.1   0.1    8    2.0    2.0
count= 4       16    2.1   0.1    8    2.0    2.0
count= 6       24    2.0   0.0   16    2.0    1.8
count= 8       32    2.9   0.0   16    2.7    2.4
count= 11      44    2.3   0.1   16    2.2    2.0
count= 16      64    2.2   0.1   16    2.1    2.0
count= 23      92    2.7   0.2   16    2.6    2.1
count= 32     128    2.5   0.1   16    2.5    2.1
count= 45     180    3.0   0.0   16    2.8    2.6
count= 64     256    3.1   0.0    8    3.0    2.5
count= 91     364    3.1   0.0    8    3.0    3.0
count= 128    512    3.4   0.2   16    3.3    3.0
count= 181    724    4.1   0.0   16    4.0    4.1
count= 256   1024    5.0   0.0    8    4.5    4.5
count= 362   1448    6.0   0.0   16    5.8    5.7
count= 512   2048    7.7   0.1   16    7.3    7.6
count= 724   2896   10.0   0.0    8   10.0    9.8
count= 1024  4096   12.3   0.1   16   12.1   12.0
count= 1448  5792   13.8   0.2    8   13.5   13.4
count= 2048  8192   18.1   0.0   16   17.9   18.1
count= 2896 11584   25.0   0.0   16   24.9   25.0
count= 4096 16384   34.2   0.1   16   34.0   34.2
# end result "Pingpong_Send_Recv"
# duration = 0.00 sec
$ mpirun --mca btl tcp,self ./skampi -i ski/skampi_pt2pt.ski
# begin result "Pingpong_Send_Recv"
count= 1        4   21.2   1.0   16   20.1   17.8
count= 2        8   20.8   1.0   16   20.6   16.7
count= 3       12   20.2   0.9   16   19.0   17.1
count= 4       16   19.9   1.0   16   19.0   17.0
count= 6       24   21.1   1.1   16   20.6   17.0
count= 8       32   20.0   1.0   16   18.8   17.1
count= 11      44   20.9   0.8   16   20.0   17.1
count= 16      64   21.7   1.1   16   20.5   17.6
count= 23      92   21.7   1.0   16   20.0   18.5
count= 32     128   21.6   1.0   16   20.5   18.5
count= 45     180   22.0   1.0   16   20.9   19.0
count= 64     256   21.8   0.7   16   20.5   20.2
count= 91     364   20.5   0.3   16   19.8   19.1
count= 128    512   18.5   0.3    8   17.5   18.1
count= 181    724   19.3   0.2    8   19.1   19.0
count= 256   1024   20.3   0.3   16   19.7   20.0
count= 362   1448   22.1   0.3   16   21.2   21.4
count= 512   2048   24.2   0.3   16   23.7   23.2
count= 724   2896   24.8   0.5    8   24.0   24.0
count= 1024  4096   26.8   0.2   16   26.1   26.3
count= 1448  5792   31.6   0.3   16   30.4   31.5
count= 2048  8192   38.0   0.6   16   37.3   37.1
count= 2896 11584   52.1   1.4   16   49.1   50.8
count= 4096 16384   93.8   1.1   16   81.1   91.5
# end result "Pingpong_Send_Recv"
# duration = 0.02 sec
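
As a rough way to compare the two runs, the sketch below divides the
tcp-only time by the shared-memory time for three of the message sizes
above. It is plain Python and not part of SKaMPI; the values are copied by
hand from the output, and it assumes the first number after the byte count
is the measured time. The names slowdown, sm_times and tcp_times are made up
for illustration.

```python
# Times copied by hand from the two Pingpong_Send_Recv runs above.
# Assumption: the first number after the byte count is the measured time.
sm_times = {4: 2.0, 1024: 5.0, 16384: 34.2}     # --mca btl self,sm,tcp
tcp_times = {4: 21.2, 1024: 20.3, 16384: 93.8}  # --mca btl tcp,self


def slowdown(sm, tcp):
    """Return {bytes: tcp_time / sm_time} for sizes present in both runs."""
    return {size: tcp[size] / sm[size] for size in sm if size in tcp}


ratios = slowdown(sm_times, tcp_times)
for size, ratio in sorted(ratios.items()):
    print(f"{size:>6} bytes: tcp-only takes {ratio:.1f}x the sm time")
```

On these numbers the gap is widest for small messages and narrows as the
message size grows.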
cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC