Hi,
can you please run again with
--mca pml ob1
If Open MPI was built with MXM support, pml/cm and mtl/mxm are used
instead of pml/ob1 and btl/openib.
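For example (just a sketch; pml_base_verbose and btl_base_verbose are the
standard MCA verbosity parameters, though the exact messages vary between
releases), running once without forcing the pml should print which PML and
BTL components actually get selected:

mpirun -np 2 --mca pml_base_verbose 10 --mca btl_base_verbose 10 osu_bw

If that shows cm/mxm being selected, forcing "--mca pml ob1" as above should
restore the ob1/btl code path for comparison.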
Cheers,
Gilles
On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
Hi folks,
I saw a performance degradation with openmpi-2.0.0 when I ran our application
on a single node (12 cores), so I ran 4 tests using osu_bw, as below:
1: mpirun -np 2 osu_bw                           bad  (30% of test 2)
2: mpirun -np 2 -mca btl self,sm osu_bw          good (same as openmpi-1.10.3)
3: mpirun -np 2 -mca btl self,sm,openib osu_bw   bad  (30% of test 2)
4: mpirun -np 2 -mca btl self,openib osu_bw      bad  (30% of test 2)
I guess the openib BTL was used in tests 1 and 3, because those results are
almost the same as test 4. I believe the sm BTL should be used even in
tests 1 and 3, because its priority is higher than openib's. Unfortunately,
at the moment I couldn't figure out the root cause, so I would appreciate it
if someone could look into it.
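One way to double-check which BTLs are actually selected (a sketch;
btl_base_verbose is a standard MCA verbosity parameter, and the exact
messages differ between releases):

mpirun -np 2 --mca btl_base_verbose 30 osu_bw

The sm BTL's exclusivity, which governs whether it wins over openib for
same-node peers, can be inspected with (assuming a recent ompi_info):

ompi_info --param btl sm --level 9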
Regards,
Tetsuya Mishima
P.S. The test results are attached below.
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -bind-to core
-report-bindings osu_bw
[manage.cluster:13389] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13389] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 1.49
2 3.04
4 6.13
8 12.23
16 25.01
32 49.96
64 87.07
128 138.87
256 245.97
512 423.30
1024 865.85
2048 1279.63
4096 264.79
8192 473.92
16384 739.27
32768 1030.49
65536 1190.21
131072 1270.77
262144 1238.74
524288 1245.97
1048576 1260.09
2097152 1274.53
4194304 1285.07
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm
-bind-to core -report-bindings osu_bw
[manage.cluster:13448] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13448] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 0.51
2 1.01
4 2.03
8 4.08
16 7.92
32 16.16
64 32.53
128 64.30
256 128.19
512 256.48
1024 468.62
2048 785.29
4096 854.78
8192 1404.51
16384 2249.20
32768 3136.40
65536 3495.84
131072 3436.69
262144 3392.11
524288 3400.07
1048576 3460.60
2097152 3488.09
4194304 3498.45
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl
self,sm,openib -bind-to core -report-bindings osu_bw
[manage.cluster:13462] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13462] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 0.54
2 1.09
4 2.18
8 4.37
16 8.75
32 17.37
64 34.67
128 66.66
256 132.55
512 261.52
1024 489.51
2048 818.38
4096 290.48
8192 511.64
16384 765.24
32768 1043.28
65536 1180.48
131072 1261.41
262144 1232.86
524288 1245.70
1048576 1245.69
2097152 1268.67
4194304 1281.33
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib
-bind-to core -report-bindings osu_bw
[manage.cluster:13521] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13521] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 0.54
2 1.08
4 2.16
8 4.34
16 8.64
32 17.25
64 34.30
128 66.13
256 129.99
512 242.26
1024 429.24
2048 556.00
4096 706.80
8192 874.35
16384 762.60
32768 1039.61
65536 1184.03
131072 1267.09
262144 1230.76
524288 1246.92
1048576 1255.88
2097152 1274.54
4194304 1281.63