On Tue, Feb 09, 2010 at 05:01:21PM -0500, Jeff Squyres wrote:

> 1. Is this now the recommended way to find all the IP interfaces that support
>    RDMA:
>
>    - loop over all local IP addresses
>    - if 127.0.0.1/8, skip
>    - try to rdma_bind_addr()
>    - if it succeeds and verbs ptr is != NULL, it's an RDMA device

RDMA is not special; it is just like any other IP service, and it is
supported on loopback. To find the list of RDMA-capable IPs, do the
rdma_bind_addr() test. You then have to transform that list exactly as
you would for TCP to get a list of candidate addresses that could be used
for remote connections. This means removing loopback, doing something
about link-local addresses, and matching, as necessary, IPs to interfaces,
to networks, and to source IPs on the connecting side.

It isn't trivial, but it is exactly the same as for TCP. Ideally, I
suppose, OMPI would use the same code for both TCP and RDMACM, and it
should have user configurables! It is worth reviewing what the OMPI TCP
code does and at least checking that the RDMACM path hits all the same
points.

> 2. Before Sean backed out the localhost behavior, when you
>    rdma_addr_bind(127.0.0.1), what did the id->verbs pointer
>    correspond to?

One of the RDMA verbs devices in the system. The API does not define
which one the kernel will select. I think the current patches simply
picked the first one.

Jason
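
For illustration only, here is a minimal sketch of the "transform the list
as you would for TCP" step for IPv4 addresses that have already passed the
rdma_bind_addr() test. The helper name is_remote_candidate() is
hypothetical, and dropping link-local addresses outright is an assumption;
the message only says that something has to be done about them. Matching
IPs to interfaces, networks, and source IPs is not covered.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of librdmacm or OMPI).
 * Returns  1 if the dotted-quad IPv4 address could be offered to a
 *            remote peer,
 *          0 if it should be dropped from the candidate list,
 *         -1 if the string is not a valid IPv4 address.
 */
static int is_remote_candidate(const char *ip)
{
    struct in_addr addr;
    uint32_t host;

    assert(ip != NULL);

    if (inet_pton(AF_INET, ip, &addr) != 1)
        return -1;

    /* Work in host byte order so the prefix checks are plain shifts. */
    host = ntohl(addr.s_addr);

    /* 127.0.0.0/8: loopback is RDMA-capable but useless to a remote peer. */
    if ((host >> 24) == 127)
        return 0;

    /* 169.254.0.0/16: link-local. Assumption: this sketch simply drops
     * these; a real policy might keep them for peers on the same link. */
    if ((host >> 16) == 0xA9FE)
        return 0;

    return 1;
}
```

A caller would run this over every address that bound successfully and
keep only the ones that return 1; a user-configurable include/exclude list
could be applied at the same point.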