I have been looking closely at the source code for openmpi-2.0.2, and I see what looks like support for GPUDirect RDMA. However, when I test it with a small GPUDirect-aware MPI program from Nvidia, I never see ibv_reg_mr() being called with the GPU UVM address. Looking more closely, it appears that mca_bml_base_prepare_src() calls opal_convertor_pack(), which in turn calls MEMCPY_CUDA(), so I don't see a case where the GPU memory is accessed peer-to-peer over PCIe by the network card.
Am I missing something? BTW, I am using cuda-8.0 and opal_config.h defines OPAL_CUDA_GDR_SUPPORT, OPAL_CUDA_GET_ATTRIBUTES, OPAL_CUDA_SUPPORT, and OPAL_CUDA_SYNC_MEMOPS as "1".
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel