George Bosilca wrote:
> Yes, in Open MPI the connections are usually created on demand. As far
> as I know there are few devices that do not abide by this "law", but
> MX is not one of them.
>
> To be more precise on how the connections are established, if we say
> that each node has two rails and we're doing a ping-pong, the first
> message from p0 to p1 will connect the first NIC, and the second
> message the second NIC (here I made the assumption that both networks
> are similar). Moreover in MX, the connection is not symmetric, so your
> (1) and (2) might happen simultaneously.
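
To check my understanding of the above, here is a toy model of on-demand connection setup over two rails. It is only a sketch, not Open MPI or MX code: the names Process, send and ping_pong are made up for illustration, and the strict round-robin choice of rail is an assumption based on the "first message, first NIC; second message, second NIC" description. Each side connects independently, which is how I read "not symmetric".

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy model of one MPI process with several NICs ("rails")."""
    name: str
    num_rails: int = 2
    # (peer name, rail index) pairs this process has already connected.
    connected: set = field(default_factory=set)
    # Per-peer message counter, used to pick the next rail round-robin.
    sent: dict = field(default_factory=dict)
    # Human-readable record of every connection this process opened.
    log: list = field(default_factory=list)

    def send(self, peer: "Process") -> int:
        """Send one message to peer and return the rail index used."""
        count = self.sent.get(peer.name, 0)
        rail = count % self.num_rails  # assumed round-robin over rails
        self.sent[peer.name] = count + 1
        if (peer.name, rail) not in self.connected:
            # Lazy, one-directional connect: only the sender's state changes.
            self.connected.add((peer.name, rail))
            self.log.append(f"{self.name} connects rail {rail} to {peer.name}")
        return rail


def ping_pong(a: Process, b: Process, round_trips: int) -> None:
    """Exchange messages back and forth between a and b."""
    for _ in range(round_trips):
        a.send(b)
        b.send(a)


p0, p1 = Process("p0"), Process("p1")
ping_pong(p0, p1, round_trips=3)
print("\n".join(p0.log + p1.log))
```

In this model each process opens exactly one connection per rail, on the first message that uses that rail, and later messages reuse it. Nothing in p1's state changes when p0 connects, so the two directions can be set up at the same time.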

Ok. I still don't see why I couldn't reproduce the problem with MX when
the progression thread was disabled. But I found a way to work around
the problem in Open-MX, so we should be good now.

> Does the code contain an MPI_Barrier ? If yes, this might be why you
> see the sequence (1), (2), (3) and (4) ...

It was hanging during startup in the Intel MPI Benchmarks. It looks like
MPI_Comm_split() in IMB_set_communicator() was causing the problem.

thanks a lot
Brice