Hi,

We’ve just been running some OSU benchmarks with OpenMPI 3.0.0 and noticed that osu_bibw gives nowhere near the bandwidth I’d expect (this is on FDR IB). However, osu_bw is fine.
If I disable eager RDMA, then osu_bibw gives the expected numbers. Similarly, if I increase the number of eager RDMA buffers, it gives the expected results.

OpenMPI 1.10.7 gives consistent, reasonable numbers with default settings, but they’re not as good as 3.0.0 (when tuned) for large buffers. The same option changes produce no difference in performance for 1.10.7.

I was wondering if anyone else has noticed anything similar and, if this is unexpected, whether anyone has a suggestion on how to investigate further?

Thanks,
Ben

Here are the numbers:

3.0.0, osu_bw, default settings

> mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bw
# OSU MPI Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
1                       1.13
2                       2.29
4                       4.63
8                       9.21
16                     18.18
32                     36.46
64                     69.95
128                   128.55
256                   250.74
512                   451.54
1024                  829.44
2048                 1475.87
4096                 2119.99
8192                 3452.37
16384                2866.51
32768                4048.17
65536                5030.54
131072               5573.81
262144               5861.61
524288               6015.15
1048576              6099.46
2097152               989.82
4194304               989.81

3.0.0, osu_bibw, default settings

> mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
1                       0.00
2                       0.01
4                       0.01
8                       0.02
16                      0.04
32                      0.09
64                      0.16
128                   135.30
256                   265.35
512                   499.92
1024                  949.22
2048                 1440.27
4096                 1960.09
8192                 3166.97
16384                 127.62
32768                 165.12
65536                 312.80
131072               1120.03
262144               4724.01
524288               4545.93
1048576              5186.51
2097152               989.84
4194304               989.88

3.0.0, osu_bibw, eager RDMA disabled

> mpirun -mca btl_openib_use_eager_rdma 0 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
1                       1.49
2                       2.97
4                       5.96
8                      11.98
16                     23.95
32                     47.39
64                     93.57
128                   153.82
256                   304.69
512                   572.30
1024                 1003.52
2048                 1083.89
4096                 1879.32
8192                 2785.18
16384                3535.77
32768                5614.72
65536                8113.69
131072               9666.74
262144              10738.97
524288              11247.02
1048576             11416.50
2097152               989.88
4194304               989.88

3.0.0, osu_bibw, increased eager RDMA buffer count

> mpirun -mca btl_openib_eager_rdma_num 32768 -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
1                       1.42
2                       2.84
4                       5.67
8                      11.18
16                     22.46
32                     44.65
64                     83.10
128                   154.00
256                   291.63
512                   537.66
1024                  942.35
2048                 1433.09
4096                 2356.40
8192                 1998.54
16384                3584.82
32768                5523.08
65536                7717.63
131072               9419.50
262144              10564.77
524288              11104.71
1048576             11130.75
2097152              7943.89
4194304              5270.00

1.10.7, osu_bibw, default settings

> mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bibw
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
1                       1.70
2                       3.45
4                       6.95
8                      13.68
16                     27.41
32                     53.80
64                    105.34
128                   164.40
256                   324.63
512                   623.95
1024                 1127.35
2048                 1784.58
4096                 3305.45
8192                 3697.55
16384                4935.75
32768                7186.28
65536                8996.94
131072               9301.78
262144               4691.36
524288               7039.18
1048576              7213.33
2097152              9601.41
4194304              9281.31
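For anyone wanting to compare runs like these without reading the columns by eye, here is a minimal sketch that parses captured OSU output and lists the message sizes where the bi-directional figure falls below the uni-directional one. It assumes the benchmark output has been saved as text; the helper names parse_osu and suspicious_sizes are hypothetical, and the embedded rows are only a subset copied from the 3.0.0 default-settings runs in this post.

```python
def parse_osu(text):
    """Parse OSU benchmark output into {message_size: bandwidth_MB_per_s}."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines that OSU prints.
        if not line or line.startswith("#"):
            continue
        size, bandwidth = line.split()
        results[int(size)] = float(bandwidth)
    return results


def suspicious_sizes(bw, bibw):
    """Return sizes where bi-directional bandwidth is below uni-directional."""
    return [size for size in sorted(bw) if size in bibw and bibw[size] < bw[size]]


# Subset of rows from the 3.0.0 osu_bw run with default settings.
OSU_BW = """\
# OSU MPI Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
64                     69.95
128                   128.55
8192                 3452.37
16384                2866.51
65536                5030.54
1048576              6099.46
"""

# Subset of rows from the 3.0.0 osu_bibw run with default settings.
OSU_BIBW = """\
# OSU MPI Bi-Directional Bandwidth Test v5.4.0
# Size      Bandwidth (MB/s)
64                      0.16
128                   135.30
8192                 3166.97
16384                 127.62
65536                 312.80
1048576              5186.51
"""

flagged = suspicious_sizes(parse_osu(OSU_BW), parse_osu(OSU_BIBW))
print(flagged)
```

The comparison is deliberately crude: it flags any size where osu_bibw is lower than osu_bw, rather than checking for the roughly 2x one would hope for on a full-duplex link.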
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users